Are Truthful Bidders Paying too Much? Efficiency and Revenue in Display Ad Auctions
Ontology alignment is a fundamental task for reconciling the heterogeneity among information systems that draw on distinct information sources. Evolutionary algorithms (EAs) have already been considered as the primary strategy for developing ontology alignment systems. However, such systems have two significant drawbacks: they either need a ground truth, which is often unavailable, or they use population-based EAs in a way that requires massive computation and memory. This article presents a new ontology alignment system, called SANOM, which uses the well-known simulated annealing algorithm as the principal technique to find the mappings between two given ontologies when no ground truth is available. In contrast to population-based EAs, simulated annealing does not need to generate populations, which makes it significantly faster and more memory-efficient for the ontology alignment problem. The paper models ontology alignment as the problem of optimizing the fitness of a state, whose optimum is sought using simulated annealing. A complex fitness function is developed that takes advantage of various similarity metrics, including string, linguistic, and structural similarities. A randomized warm initialization is specially tailored to simulated annealing in order to expedite its convergence. The experiments show that SANOM is competitive with the state of the art and is significantly superior to other EA-based systems.
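The search described above can be illustrated with a minimal sketch. It is not the SANOM implementation: the fitness here uses only a string similarity from the Python standard library, the start state is a plain random permutation rather than the paper's warm initialization, and the function names, the swap move, and the cooling parameters are all assumptions made for illustration. It also assumes both ontologies contribute the same number of entity labels (at least two).

```python
import math
import random
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a richer fitness function."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fitness(state, source, target):
    """Sum of similarities over the matched pairs source[i] -> target[state[i]]."""
    return sum(similarity(source[i], target[j]) for i, j in enumerate(state))


def anneal(source, target, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Search over one-to-one mappings (permutations) for a high-fitness state."""
    rng = random.Random(seed)
    state = list(range(len(target)))
    rng.shuffle(state)  # assumed random start, not the paper's warm start
    current = fitness(state, source, target)
    best, best_fit = state[:], current
    temp = t0
    for _ in range(steps):
        # Neighbour move: swap the targets assigned to two source entities.
        i, j = rng.sample(range(len(state)), 2)
        state[i], state[j] = state[j], state[i]
        candidate = fitness(state, source, target)
        delta = candidate - current
        # Always accept improvements; accept worse states with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best_fit:
                best, best_fit = state[:], current
        else:
            state[i], state[j] = state[j], state[i]  # undo the swap
        temp *= cooling
    pairs = [(source[i], target[j]) for i, j in enumerate(best)]
    return pairs, best_fit
```

Only a single state is kept in memory at any time, which is the property the abstract contrasts with population-based EAs.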
The Centers for Medicare & Medicaid Services (CMS) launched its nursing home rating system in 2008, and the system has been widely used by patients, doctors, and insurance companies. It rates nursing homes based on a combination of CMS's inspection results and nursing homes' self-reported measures. Prior research has shown that the rating system is subject to inflation in the self-reporting procedure, leading to biased overall ratings. Given CMS's limited resources, it is important to optimize the inspection process and to develop an effective audit process that detects and deters inflation. In this paper, we first examine whether the domain that CMS currently inspects is the best choice in terms of reducing the percentage of nursing homes that can inflate and reducing the difficulty of detecting such inflators. We develop a novel graph-based approach and use publicly available data on nursing home ratings to show that CMS's current choice of inspection domain is not optimal if it intends to minimize the number of potential inflators, and that CMS would be better off inspecting the staffing domain instead. We also show that CMS's current choice of inspection domain would be optimal only if an audit system were in place to complement it. We then design an audit system for CMS to be coupled with its current inspection strategy. We analyze the performance of the audit system in terms of net audit budget and audit efficiency. To design the audit system, we consider nursing homes' reactions to different audit policies and conduct a detailed simulation study of the optimal audit parameter settings. Our results suggest that CMS should use a moderate audit policy in order to carefully balance the tradeoff between net audit budget and audit efficiency.
This paper presents a name disambiguation approach that resolves ambiguities among person names and groups web pages according to the individuals they refer to. The proposed approach exploits two important sources of entity-centric semantic information extracted from web pages: personal attributes and social relationships. It takes as input the web pages returned by a search for a person name. The web pages are analysed to extract personal attributes and social relationships, which are then mapped into an undirected weighted graph, called the attribute-relationship graph. A graph-based clustering algorithm is proposed to group the nodes representing the web pages, each of which refers to a person entity. The outcome is a set of clusters such that the web pages within each cluster refer to the same person. We show the effectiveness of our approach by evaluating it on the large-scale datasets WePS-1, WePS-2, and WePS-3. Experimental results are encouraging and show that the proposed method clearly outperforms several baseline methods as well as its counterparts.
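The pipeline above (features per page, a weighted graph over pages, then clustering) can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: edge weights are taken to be the Jaccard overlap of each page's extracted attributes and related names, and the clustering step is replaced by connected components over edges whose weight reaches a threshold. The function names, the threshold value, and the sample pages are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two feature sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(features, threshold=0.3):
    """Group pages connected by edges with weight >= threshold.

    `features` maps a page id to the set of attributes and related
    person names extracted from that page (assumed input format).
    """
    parent = {page: page for page in features}

    def find(page):
        # Union-find lookup with path halving.
        while parent[page] != page:
            parent[page] = parent[parent[page]]
            page = parent[page]
        return page

    # Each sufficiently heavy edge merges the clusters of its two pages.
    for p, q in combinations(features, 2):
        if jaccard(features[p], features[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for page in features:
        clusters.setdefault(find(page), set()).add(page)
    return list(clusters.values())


# Hypothetical search results for one ambiguous person name.
pages = {
    "p1": {"professor", "MIT", "Alice Chen"},
    "p2": {"professor", "MIT", "Bob Lee"},
    "p3": {"guitarist", "Nashville"},
}
clusters = cluster_pages(pages)
```

Here pages p1 and p2 share two of four distinct features (weight 0.5) and fall into one cluster, while p3 shares nothing with either and stays on its own.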