Evaluating Semantic Search Query Approaches
with Expert and Casual Users
Khadija Elbedweihy, Stuart N. Wrigley, and Fabio Ciravegna
Department of Computer Science, University of Sheffield, UK
Abstract. Usability and user satisfaction are of paramount importance when designing interactive software solutions. Furthermore, the optimal design can be dependent not only on the task but also on the type of user. Evaluations can shed light on these issues; however, very few studies have focused on assessing the usability of semantic search systems. As semantic search becomes mainstream, there is growing need for standardised, comprehensive evaluation frameworks. In this study, we assess the usability and user satisfaction of different semantic search query input approaches (natural language and view-based) from the perspective of different user types (experts and casuals). Contrary to previous studies, we found that casual users preferred the form-based query approach whereas expert users found the graph-based to be the most intuitive. Additionally, the controlled-language model offered the most support for casual users but was perceived as restrictive by experts, thus limiting their ability to express their information needs.
Semantic Web search engines (e.g. Sindice) offer gateways to locate Semantic Web documents and ontologies; ontology-based natural language interfaces (e.g. NLP-Reduce) and visual query approaches (e.g. Semantic Crystal) allow more user-friendly querying; while others try to provide the same support but on the open Web of Data. These search approaches require and employ different query languages. Free-NL provides high expressiveness by allowing users to input queries using their own terms (keywords or full sentences). Controlled-NL provides support during query formulation through suggestions of valid query terms found in the underlying – restrictive – vocabulary.
Finally, view-based (graphs and forms) approaches aim to provide the most support to users by visualising the search space in order to help them understand the available data and the possible queries that can be formulated.
Evaluation of software systems – including user interfaces – has been acknowledged in literature as a critical necessity. Indeed, large-scale evaluations foster research and development by identifying gaps in current approaches and suggesting areas for improvements and future work. Following the Cranfield model – using a test collection, a set of tasks and relevance judgments – and using standard evaluation measures such as precision and recall has been the dominant approach in IR evaluations, led by TREC. This approach has not been without criticisms and there have been long-standing calls for assessing the interactive aspect as well.
This work was partially supported by the European Union 7th FWP ICT based e-Infrastructures Project SEALS (Semantic Evaluation at Large Scale, FP7-238975).
Cudré-Mauroux et al. (Eds.): ISWC 2012, Part II, LNCS 7650, 2012. Springer-Verlag Berlin Heidelberg 2012
In an attempt to address these issues, more studies have been conducted with a focus on Interactive Information Retrieval (IIR). The ones embodied within TREC (Interactive Track and Complex Interactive Question-Answering) involved real users to create topics or evaluate documents rather than to assess usability and usefulness of the IR systems. Others investigated users' perception of ease-of-use and user control with respect to the effectiveness of the retrieval process or studied the impact and use of cross-language retrieval systems.
With respect to the type of users involved in these studies, some have opted to further differentiate between casual users and expert users. In the context of these works and indeed in ours, casual users refer to those with very little or no knowledge in a specific field (e.g., Semantic Web, for our study), while expert users have more knowledge and experience in that field.
Inheriting IR's evaluation paradigm, Semantic Search evaluation efforts have been largely performance-oriented with limited attention to the user-related aspects. Kaufmann and Bernstein conducted a within-subjects (same group of subjects evaluate all the participating tools) evaluation of four tools adopting NL- and graph-based approaches with 48 casual users, while another evaluation featured NL- and form-based tools.
The evaluation described here differs from these earlier studies in the following ways: 1) a broader range of query approaches, 2) all tools are evaluated within-subjects, and 3) equal-sized subject groups for casual and expert users. These differences are important and allow novel analyses to be conducted since they facilitate direct comparison of the evaluated approaches and a first-time understanding and comparison of how the two types of users perceive the usability of these approaches. Although some IIR studies involved casual and expert users, most of these focused on investigating differences in the search behaviour and strategies.
The remainder of the paper is organized as follows: first, the usability study is described. Next, the results and analyses are discussed together with the main conclusions and finally, the limitations are pointed out with planned future work.
The underlying question of the research presented in this paper is how users perceive the usability of different semantic search approaches (specifically support in query formulation and suitability of results returned), and whether this perception is different between expert and casual users. To answer the question, ten casual users and ten expert users were asked to perform five search tasks with five tools adopting NL-based and view-based query approaches. These are user-centric semantic search tools (e.g. query given as natural language or using a form or a graph) querying a repository of semantic data and returning answers extracted from them. The results returned must be answers rather than documents; however, they are not limited to a specific style (e.g., list of entity URIs or visualised results). Experiment results such as query input time, success rates and input of questionnaires are recorded. These results are quantitatively and qualitatively analysed to assess tools' usability and user satisfaction.
Dataset and Questions
The main requirement for the dataset is to be from a simple and understandable domain for users to be able to formulate the given questions into the tools' query languages. Hence, the geography dataset within the Mooney Natural Language Learning Data was selected. It contained predefined English language questions and has been used by other related studies. The five evaluation questions (given below) were chosen to range from simple to complex ones and to test tools' ability in supporting specific features such as comparison or negation.
1. Give me all the capitals of the USA? This is the simplest question, consisting of only one ontology concept, 'capital', and one relation between this concept and the given instance: USA.
2. What are the cities in states through which the Mississippi runs? This question contains two concepts, 'city' and 'state', and two relations: one between the two concepts and one linking state with Mississippi.
3. Which states have a city named Columbia with a city population over 50,000? This question features comparison for a datatype property (city population) and a specific value (50,000).
4. Which lakes are in the state with the highest point? This question tests the ability for supporting superlatives (highest point).
5. Tell me which rivers do not traverse the state with the capital Nashville? Negation is a traditionally challenging feature for semantic search.
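To make the negation in question 5 concrete, it can be read as a set difference: all rivers minus those that traverse the target state. The Python sketch below illustrates this reading on a tiny hand-made dataset. The dictionary names, the function name and the listed rivers and states are illustrative assumptions and are not taken from the Mooney dataset or from any of the evaluated tools.

```python
# Toy data (hypothetical; not the contents of the Mooney geography dataset).
capital_of = {"tennessee": "nashville", "ohio": "columbus"}
traverses = {
    "mississippi": {"tennessee", "louisiana"},
    "ohio river": {"ohio", "kentucky"},
    "cumberland": {"tennessee", "kentucky"},
}

def rivers_not_traversing_state_with_capital(capital):
    """Answer a negated question as a set difference."""
    # Step 1: resolve the state(s) whose capital matches.
    states = {s for s, c in capital_of.items() if c == capital}
    # Step 2: collect the rivers that DO traverse such a state.
    excluded = {r for r, ss in traverses.items() if ss & states}
    # Step 3: negation = all known rivers minus the excluded ones.
    return set(traverses) - excluded

print(sorted(rivers_not_traversing_state_with_capital("nashville")))
```

A tool without support for negation can typically only produce the set from step 2, which is the complement of what the user asked for.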
Twenty subjects were recruited for the evaluation; ten of these subjects were casual users and ten were expert users. The 20 subjects (12 females, 8 males) were aged between 19 and 46 with a mean of 30 years. The experiment followed a within-subjects design to allow direct comparison between the evaluated query approaches. Additionally, with this design, usually fewer participants are required to get statistically significant results. All 20 subjects evaluated the five tools in randomised order to avoid any learning, tiredness or frustration effects that could influence the experiment results. Furthermore, to avoid any possible bias introduced by developers evaluating their own tools, only one test leader – who is also not the developer of any of the tools – was responsible for running the whole experiment.
For each tool, subjects were given a short demo session explaining how to use it to formulate queries. After that, subjects were asked to formulate each of the five questions in turn using the tool's interface. The order of the questions was randomised for each tool to avoid any learning effects. After testing each tool, subjects were asked to fill in two questionnaires.
Finally, we collected demographics data such as age, profession and knowledge of linguistics (see for details of all three questionnaires). Each experiment with one user took between 60 and 90 minutes.
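The randomisation described above (tool order per subject, question order per tool) can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name `session_plan`, the seeding scheme and the use of Python's `random` module are assumptions.

```python
import random

TOOLS = ["NLP-Reduce", "Ginseng", "K-Search", "Semantic Crystal", "Affective Graphs"]
QUESTIONS = [1, 2, 3, 4, 5]

def session_plan(subject_id, seed=0):
    """Return a randomised tool order with a fresh question order per tool."""
    # Seeding per subject keeps each plan reproducible (an assumption of this sketch).
    rng = random.Random(f"{seed}-{subject_id}")
    tool_order = rng.sample(TOOLS, k=len(TOOLS))
    return [(tool, rng.sample(QUESTIONS, k=len(QUESTIONS))) for tool in tool_order]

for tool, question_order in session_plan(subject_id=3):
    print(tool, question_order)
```

Every subject still sees all five tools and all five questions, which is what makes the design within-subjects; only the order varies.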
In assessing usability of user interfaces, several measurements including time required to perform tasks, success rate and perceived user satisfaction were proposed in the literature of IIR and HCI.
Similar to these studies and indeed to allow for deeper analysis, we collected both objective and subjective data covering the experiment results. The first included: 1) input time required by users to formulate their queries, 2) number of attempts showing how many times on average users reformulated their query to obtain answers with which they were satisfied (or indicated that they were confident a suitable answer could not be found), and 3) answer found rate capturing the distinction between finding the appropriate answer and the user 'giving up' after a number of attempts. This data was collected using custom-written software which allowed each experiment run to be orchestrated.
Additionally, subjective data was collected using the think-aloud strategy and two post-search questionnaires. The first is the System Usability Scale (SUS) questionnaire, a standardised usability test consisting of ten normalised questions covering aspects such as the need for support, training, and complexity, which has proven to be very useful when investigating interface usability. The second questionnaire (Extended Questionnaire) is one which we designed to capture further aspects such as the user's satisfaction with respect to the tool's query language and the content returned in the results as well as how it was presented. After completing the experiment, subjects were asked to rank the tools according to four different criteria (each one separately): how much they liked the tools (Tool Rank); how much they liked their query interfaces: graph-based, form-based, free-NL and controlled-NL (Query Interface Rank); how much they found the results to be informative and sufficient (Results Content Rank); and finally how much they liked the results presentation (Results Presentation Rank).
Note that users were allowed to give equal rankings for multiple tools if they had no preference for one over the other. To facilitate comparison, for each criterion, the rankings given by all users for one tool were summed and the resulting score was then normalised to range between 0 and 1 (where 1 is the highest).
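As an illustration of the three objective measures, the sketch below aggregates them from a list of trial records. The `Trial` record layout and the `summarise` function are hypothetical; the custom-written software used in the study is not described in enough detail to reproduce, and the use of the mean here is an assumption of the sketch.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Trial:
    """One subject answering one question with one tool (hypothetical record)."""
    tool: str
    input_times: list   # seconds spent formulating each attempt
    answer_found: bool  # False if the subject gave up

def summarise(trials):
    """Aggregate the three objective measures over a list of trials."""
    return {
        "mean_input_time": mean(t for tr in trials for t in tr.input_times),
        "mean_attempts": mean(len(tr.input_times) for tr in trials),
        "answer_found_rate": sum(tr.answer_found for tr in trials) / len(trials),
    }

trials = [
    Trial("K-Search", [30.0, 50.0], True),
    Trial("K-Search", [40.0], False),
]
print(summarise(trials))
```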
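The exact normalisation formula is not stated in the text. One plausible reading, sketched below, converts each rank into points (best rank earns the most points), sums the points per tool and applies min-max scaling over the attainable range. The function name and the point scheme are assumptions made for illustration only.

```python
def normalised_rank_scores(rankings, n_tools):
    """rankings: one dict per user mapping tool -> rank (1 = best, ties allowed)."""
    n_users = len(rankings)
    lowest = n_users * 1         # every user ranks the tool last
    highest = n_users * n_tools  # every user ranks the tool first
    totals = {}
    for user in rankings:
        for tool, rank in user.items():
            # Convert a rank into points so that higher is better.
            totals[tool] = totals.get(tool, 0) + (n_tools + 1 - rank)
    return {tool: (total - lowest) / (highest - lowest) for tool, total in totals.items()}

# Two users, two tools; the second user gives both tools the same rank.
print(normalised_rank_scores([{"A": 1, "B": 2}, {"A": 1, "B": 1}], n_tools=2))
```

With ten users and five tools this scheme can only produce multiples of 0.025, which is at least compatible with rank values such as 0.875 and 0.225 quoted in the results, but that agreement does not confirm it is the formula the authors used.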
Results and Discussion
Evaluated tools included free-NL- (NLP-Reduce), controlled-NL- (Ginseng), form- (K-Search), and finally graph-based (Semantic Crystal and Affective Graphs) approaches. Results for both expert and casual users are presented in Tables 1 and 2 respectively. In these tables, a number of different factors are reported such as the SUS scores and the tools' rankings. We also include the scores from two of the most relevant questions from the extended questionnaire.
EQ1: liked presentation shows the average response to the question "I liked the presentation of the answers", while EQ2: query language easy shows it for the question "The system's query language was easy to use and understand".
Note that in the rest of this section, we use the term tool (e.g. graph-based tools) to refer to the implemented tool as a full semantic search system (with respect to its query interface and approach, functionalities, results presentation, etc.) and the term query approach (e.g. graph-based query approach) to specifically refer to the style of query input adopted.
To quantitatively analyse the results, SPSS was used to produce averages, perform correlation analysis and check the statistical significance. The median (as opposed to the mean) was used throughout the analysis since it was found to be less susceptible to outliers or extreme values sometimes found in the data.
In the qualitative analysis, the open coding technique was used, in which the data was categorised and labelled according to several aspects dominated by usability of the tools' query approaches and returned answers.
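The preference for the median in the quantitative analysis can be illustrated with a small example. The input times below are invented for the illustration and are not data from the study.

```python
from statistics import mean, median

# Hypothetical input times in seconds; the last value is an outlier.
input_times = [42.0, 45.0, 47.0, 50.0, 310.0]

print(mean(input_times))    # pulled upwards by the single outlier
print(median(input_times))  # stays at the middle observation
```

A single subject who struggled with one query more than doubles the mean here, while the median still reflects the typical subject.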
Expert User Results
According to the established adjective ratings for SUS scores, Ginseng – with the lowest SUS score – is classified as Poor, NLP-Reduce as Poor to OK, K-Search and Semantic Crystal are both classified as OK, while Affective Graphs, which managed to get the highest average SUS score, is classified as Good. These results are also confirmed by the tools' ranks (see Table 1). Affective Graphs was selected 60% of the time as the most-liked tool and thus got the highest rank (0.875), followed by Semantic Crystal and K-Search (0.625 and 0.6 respectively), and finally Ginseng and NLP-Reduce got a very low rank (0.225), with each being chosen as the least-liked tool four times and twice, respectively. Since the rankings are an inherently relative measure, they allow for direct tool-to-tool comparisons to be made. Such comparisons using the SUS questionnaire may be less reliable since the questionnaire is completed after each tool's experiment (and thus temporally spaced) with no direct frame of reference to any of the other tools.
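For readers unfamiliar with how a SUS score is obtained from the ten questionnaire items, the sketch below applies the standard scoring rule for the scale: odd-numbered items contribute the response minus one, even-numbered items contribute five minus the response, and the sum is multiplied by 2.5. The function name is an assumption, and the paper does not state that its scores were computed with this exact code.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses between 1 and 5")
    total = 0
    for item, response in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5

print(sus_score([3] * 10))  # neutral answers on every item
```

Because each tool's questionnaire is scored independently, two tools can receive similar SUS scores even when a subject clearly prefers one of them, which is why the relative rankings are reported alongside.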
Table 1 also shows that Affective Graphs, which is most liked and found to be the most intuitive by users, managed to get satisfactory answers for 80% of the queries, followed by K-Search (50%), which employs the second most-liked query approach. Finally, it was found that none of the participating tools supported negation (except partially Affective Graphs). This was confirmed by the answer found rate for the question "Tell me which rivers do not traverse the state with the capital Nashville?" being: Affective Graphs: 0.4, Semantic Crystal: 0.1, K-Search: 0.1, Ginseng: 0.1, NLP-Reduce: 0.0.
Expert Users Prefer Graph- and Form-Based Approaches: Results showed that graph- and form-based approaches were the most liked by expert users. However, in terms of overall satisfaction (see SUS scores and Tool Rank in Table 1), graph-based tools outperformed the form- and NL-based ones. Additionally, feedback showed that users were able to formulate more complex queries with the view-based approaches (graphs and forms) than with the NL ones (free and controlled). Indeed, the ability to visualise the search space provides an understanding of the available data (concepts) as well as connections found between them (relations), which shows how they can be used together in a query.
It is interesting to note that although Affective Graphs and Semantic Crystal both employ a graph-based query approach, users had different perceptions of their usability. More users gave the query interface of Affective Graphs higher scores than Semantic Crystal (quartiles: "3.75, 5" and "2, 4.25" respectively) since they found it to be more intuitive. The most repeated (60%) positive comment given for Affective Graphs was "the query interface is intuitive and easy/pleasant to use". This is a surprising outcome since graph-based approaches are known to be complicated and laborious. However, this has not been explicitly assessed from expert users' perspective in any similar studies.
Table 1. Tools results for expert users. Non-ranked scores are median values; bold values indicate best performing tool in that category.
[Table body lost in extraction. Surviving row labels: Query Language Rank (0-1); Results Content Rank (0-1); Results Presentation Rank (0-1); EQ1: liked presentation (0-5); EQ2: query language easy (0-5); Number of Attempts; Answer Found Rate (0-1). Surviving values, not attributable to a cell: 0.875 and 4.]
An important difference was observed between the two graph-based tools: Semantic Crystal visualises the entire ontology whereas Affective Graphs opted for showing only the concepts and relations selected by the users (see Fig. 1). Although feedback showed that users preferred the first approach, it imposes a limitation on how much can be displayed in the visualisation window. With a small ontology, the graph is clear and can be easily explored; as the ontology gets bigger, the view would easily get cluttered with concepts and links showing relations between them. This would negatively affect the usability of the interface and in turn the user experience.
Expert Users Frustrated by Controlled-NL: Although the guidance provided by the controlled-NL approach was sometimes appreciated, restricting expert users to the tool's vocabulary was more annoying. This resulted in an unsatisfying experience (lowest SUS score of 32.5 and least liked interface), which is supported by the most repeated negative comments given for Ginseng:
– It is frustrating when you cannot construct queries in the way you want.
– You need to know in advance the vocabulary to be able to use the system.
The second comment is in stark contrast to what the controlled-NL approach is designed to provide. It is intended to help users formulate their queries without having to know the underlying vocabulary. However, even with the guidance, users frequently got stuck because they did not know how to associate the suggested concepts, relations or instances together. This is confirmed by users requiring the longest input time when using Ginseng (see Input Time in the results tables).
Table 2. Tools results for casual users. Non-ranked scores are median values; bold values indicate best performing tool in that category.
[Table body lost in extraction. Surviving row labels: Query Language Rank (0-1); Results Content Rank (0-1); Results Presentation Rank (0-1); EQ1: liked presentation (0-5); EQ2: query language easy (0-5); Number of Attempts; Answer Found Rate (0-1). Surviving values, not attributable to a cell: 0.775 and 4.]
Casual User Results
Graph-Based Tools More Complex If Entire Ontology Not Shown: Recall from the expert user results that expert users preferred the approach of visualising the entire ontology (adopted by Semantic Crystal as shown in Fig. 1). This was indeed more appreciated by casual users, resulting in Semantic Crystal receiving higher scores. Surprisingly, the lack of this feature caused Affective Graphs to be perceived by casual users as the most complex and difficult to use: 50% of the users found it to be "less intuitive and has higher learning curve than NL".
Tool Interface Aesthetics Important to Casual Users: Most of the casual users (70%) liked the interface of Affective Graphs for having an animated, modern and visually-appealing design. This not only created a pleasant search experience but was also helpful during query formulation (e.g., highlighting selected concepts) and in turn balanced the negative effect of not showing the entire ontology, resulting in high user satisfaction (second highest SUS score).
Casual Users Prefer Form-Based Approach: Casual users needed less input time with the form-based approach and found it less complicated than the graph-based approach while allowing more complex queries than the NL-based ones. However, unexpectedly, more attempts were required to formulate their queries using this approach. The presence of inverse relations in the ontology was viewed by casual users as unnecessary redundancy. This impression led to confusion and thus required more trials to formulate the right queries. For instance, to query for the rivers running through a certain state, two alternatives ("State, hasRiver, River" and "River, runsthrough, State") were adopted by users. Tools ought to take the burden off users and provide one unique way to formulate a single query.
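One way a tool could offer a single formulation for such a query is to rewrite inverse properties into one preferred direction before presenting or executing it. The sketch below is a hypothetical illustration of that idea: the `INVERSE_OF` table and the `canonical` function are assumptions, and none of the evaluated tools is claimed to work this way.

```python
# Hypothetical inverse-property table; property names follow the example above.
INVERSE_OF = {"hasRiver": "runsthrough"}

def canonical(triple):
    """Rewrite (subject, property, object) so inverse pairs share one form."""
    subject, prop, obj = triple
    if prop in INVERSE_OF:
        # Swap subject and object and use the preferred direction.
        return (obj, INVERSE_OF[prop], subject)
    return triple

print(canonical(("State", "hasRiver", "River")))
print(canonical(("River", "runsthrough", "State")))
```

Both user formulations reduce to the same pattern, so the interface would only need to expose one of them.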
Casual Users Liked Controlled-NL Support: Casual users found the guidance offered by suggesting valid query terms very helpful, and it provided them with more confidence in their queries. Interestingly, they preferred to be 'controlled' by the language model (allowing only valid queries) rather than having more expressiveness (provided by free-NL) while creating more invalid queries.
Fig. 1. Different visualizations of the Mooney ontology by the tools: (a) Semantic Crystal, (b) Affective Graphs
Results Independent of User Type
This section discusses results and ﬁndings common to both types of users.
Form-Based Faster But More Tedious Than Graph-Based: Results showed that both types of users took less time to formulate their queries with the form-based approach than with the graph-based ones (approximate difference: 36% for experts, 14% for casuals). However, it was found to be more laborious to use than graphs, especially when users had to inspect the concepts and properties (presented in a tree-like structure) to select the required ones for the query. This is a challenge acknowledged in the literature for form-based approaches and is supported by the feedback given by users: the most repeated negative comment was "It was hard to find what I was looking for once a number of items in the tree are expanded". Additionally, this outcome suggests that input time cannot be used as the sole metric to inform usability of query approaches.
Free-NL Simplest and Most Natural; Suffers from Habitability Problem: The free-NL approach was appreciated by users for being the most simple and natural to them. However, the results showed a frequent mismatch between users' query terms and the ones expected by the tool. This is caused by the abstraction of the search space and is known in literature as the habitability problem [p.2]. This is supported by the users' most repeated negative comment: "I have to guess the right words". They found that they could get answers with specific query terms rather than others. For instance, using 'run through' with 'river' returns answers which are not given when using 'traverse'. This is also confirmed by the tool (NLP-Reduce) getting the lowest success rate (20%).
Furthermore, requiring the highest number of attempts (4.1) supports users' feedback that they had to rephrase their queries to find the combination of words the tool is expecting. Indeed, this is a general challenge facing natural language interfaces.
Results Content and Presentation Affected Usability and Satisfaction: When evaluating semantic search tools, it is important – besides evaluating performance and usability – to assess the usefulness of the information returned as well as how it is presented. Within this context, our study found that the results presentation style employed by K-Search was the most liked by all users as shown in Tables 1 and 2. It is interesting to note how small details such as organising answers in a table or having a visually-appealing display (adopted by K-Search) have a direct impact on results readability and clarity and, in turn, user satisfaction. This is shown by the most repeated comments given for K-Search: "I liked the way answers are displayed" and "results presentation was easy to interpret". Additionally, K-Search is the only tool that did not present a URI for an answer but used a reference to the document using a NL label. This was favoured by users who often found URIs to be technical and more targeted towards domain experts. For instance, one user specifically mentioned that having "http://www.mooney.net/geo#tennesse2" as an answer was not understandable. By examining the ontology, this was found to be the URI of tennessee river and it had the '2' at the end to differentiate it from tennessee state, which had the URI "http://www.mooney.net/geo#tennesse". This suggests that, unless users are very familiar with the data, presenting URIs alone is not very helpful. By analysing users' feedback from a similar usability study, Elbedweihy et al. found that when returning answers to users, each result should be augmented with associated information to provide a 'richer' user experience. This was similarly shown by users' feedback in our study with the following comments regarding potential improvements often given for all the tools:
– Maybe a ‘mouse over' function with the results that show more information.
– Perhaps related information with the results.
– Providing similar searches would have been helpful.
For example, for a query requiring information about states, tools could go a step further and return extra information about each state – rather than only providing name and URI – such as the capital, area, population or density, among others. Furthermore, they could augment the results with ones associated with related concepts which might be of interest to users. Again, these could be instances of lakes or mountains (examples of concepts related to state) found in a state. This notion of relatedness or relevancy is clearly domain-dependent and is itself a research challenge. In this context, Elbedweihy et al. suggested a notion of relatedness based on collaborative knowledge found in query logs.
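The point about URIs versus readable labels can be sketched in a few lines: show a human-readable label when one is available and fall back to the URI fragment otherwise. The `LABELS` store, its single entry and the `display_name` function are hypothetical and only illustrate the presentation choice; they do not describe how K-Search obtains its labels.

```python
from urllib.parse import urldefrag

# Hypothetical label store; a real tool might read label values from the data.
LABELS = {"http://www.mooney.net/geo#tennesse2": "Tennessee (river)"}

def display_name(uri):
    """Prefer a human-readable label; fall back to the URI fragment."""
    if uri in LABELS:
        return LABELS[uri]
    fragment = urldefrag(uri).fragment
    return fragment or uri

print(display_name("http://www.mooney.net/geo#tennesse2"))
print(display_name("http://www.mooney.net/geo#tennesse"))
```

The fallback shows why labels matter: without one, the user only sees the bare fragment, which still does not say whether the state or the river is meant.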
Benefit of Displaying Generated Formal Query Depends on User Type: While casual users often perceived the formal query generated by a tool as confusing, experts liked the ability to see the formal representation of their constructed query since it increased their confidence in what they were doing. Indeed, being able to perform direct changes to the formal query increased the expressiveness of the query language as perceived by expert users.
Table 3. Query input time (in seconds) required by expert and casual users
[Table body lost in extraction. Surviving values: Expert Users 88.86; Casual Users 72.8.]
Experts Plan Query Formulation More Than Casuals: As shown in Table 3, with most of the tools, expert users took more time to build their queries than casual ones. The feedback showed that the former often spent more time planning – and verbally describing – their rationale (e.g. "so it understands abbreviations and it seems to work better with sentences than with keywords") during query formulation. Interestingly, studies on user search behaviour found similar results: Tabatabai and Shore found that "Novices were less patient and relied more on trial-and-error." [p.238] and Navarro-Prieto et al. showed that "Experienced searchers ... planned in advance more than the novice participants".
Conclusions
In this paper, we have discussed a usability study of five semantic search tools employing four different query approaches: free-NL, controlled-NL, graph-based and form-based. The study – which used both expert and casual users – has identified a number of findings, the most important of which are summarised below.
Graph-based approaches were perceived by expert users as intuitive, allowing them to formulate more complex queries, while casual users, despite finding them difficult to use, enjoyed the visually-appealing interfaces which created an overall pleasant search experience. Also, showing the entire ontology helped users to understand the data and the possible ways of constructing queries.
However, unsurprisingly, the graph-based approach was judged as laborious and time consuming. In this context, the form-based approach required less input time. It was also perceived as a midpoint between NL-based and graph-based, allowing more complex queries than the former, yet less complicated than the latter.
Additionally, casual users found the controlled-NL support to be very helpful whereas expert users found it to be very restrictive and thus preferred the flexibility and expressiveness offered by free-NL. A major challenge for the latter was the mismatch between users' query terms and ones expected by the tool (habitability problem). The results also support the literature showing that negation is a challenge for semantic search tools: only one tool provided partial support for negation. Furthermore, the study showed that users often requested the search results to be augmented with more information to have a better understanding of the answers. They also mentioned the need for a more user-friendly results presentation format. In this context, the most liked presentation was that employed by K-Search, providing results in a tabular format that was perceived as clear and visually-appealing.
To conclude, this usability study highlighted the advantage of visualising the search space offered by view-based query approaches. We suggest combining this with a NL-input feature that would balance difficulty and speed of query formulation. Indeed, providing optional guidance for the NL input could be the best way to cater to both expert and casual users within the same interface. These findings are important for developers of future query approaches and similar user interfaces who have to cater for different types of users with different preferences and needs. For future work and, indeed, to have a more complete picture, we plan to assess how the interaction with the search tools affects the information seeking process (usefulness). To achieve this, we will use questions with an overall goal – as opposed to ones which are not part of any overarching information need – and compare users' knowledge before and after the search task. This would also allow us to evaluate advanced features such as formulating complex queries, merging results of subqueries or assessing relevancy and usefulness of additional information presented with the results.
References
1. Tummarello, G., Delbru, R., Oren, E.: Sindice.com: Weaving the Open Linked Data. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 552–565. Springer, Heidelberg (2007)
2. Kaufmann, E., Bernstein, A.: Evaluating the usability of natural language query languages and interfaces to semantic web knowledge bases. J. Web Sem. 8 (2010)
3. Lopez, V., Motta, E., Uren, V.: PowerAqua: Fishing the Semantic Web. In: Sure, Y., Domingue, J. (eds.) ESWC 2006. LNCS, vol. 4011, pp. 393–410. Springer, Heidelberg (2006)
4. Harth, A.: VisiNav: A system for visual search and navigation on web data. J. Web Sem. 8, 348–354 (2010)
5. Saracevic, T.: Evaluation of evaluation in information retrieval. In: Proc. SIGIR
6. Halpin, H., Herzig, D.M., Mika, P., Blanco, R., Pound, J., Thompson, H.S., Tran, D.T.: Evaluating Ad-Hoc Object Retrieval. In: Proc. IWEST 2010 Workshop (2010)
7. Cleverdon, C.W.: Report on the first stage of an investigation into the comparative efficiency of indexing systems. Technical report, The College of Aeronautics, Cranfield, England (1960)
8. Spärck Jones, K.: Further reflections on TREC. Inf. Process. Manage. 36, 37–85
9. Voorhees, E.M.: The Philosophy of Information Retrieval Evaluation. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds.) CLEF 2001. LNCS, vol. 2406, pp. 355–370. Springer, Heidelberg (2002)
10. Ingwersen, P., Järvelin, K.: The Turn: Integration of Information Seeking and Retrieval in Context. Springer (2005)
11. Salton, G.: Evaluation problems in interactive information retrieval. Information Storage and Retrieval 6, 29–44 (1970)
12. Tague, J., Schultz, R.: Evaluation of the user interface in an information retrieval system: A model. Inf. Process. Manage., 377–389 (1989)
13. Hersh, W., Over, P.: SIGIR workshop on interactive retrieval at TREC and beyond. SIGIR Forum 34 (2000)
14. Kelly, D., Lin, J.: Overview of the TREC 2006 ciQA task. SIGIR Forum 41, 107–
15. Xie, H.: Supporting ease-of-use and user control: desired features and structure of web-based online IR systems. Inf. Process. Manage. 39, 899–922 (2003)
16. Petrelli, D.: On the role of user-centred evaluation in the advancement of interactive information retrieval. Inf. Process. Manage. 44, 22–38 (2008)
17. Tabatabai, D., Shore, B.M.: How experts and novices search the Web. Library & Information Science Research 27, 222–248 (2005)
18. Navarro-Prieto, N., Scaife, M., Rogers, Y.: Cognitive strategies in web searching. In: Proc. the 5th Conference on Human Factors and the Web, vol. 2004, pp. 1–13 (1999)
19. Balog, K., Serdyukov, P., de Vries, A.: Overview of the TREC 2010 Entity Track. In: TREC 2010 Working Notes (2010)
20. Kaufmann, E., Bernstein, A.: How Useful Are Natural Language Interfaces to the Semantic Web for Casual End-Users? In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ISWC/ASWC 2007. LNCS, vol. 4825, pp. 281–294. Springer, Heidelberg (2007)
21. Elbedweihy, K., Wrigley, S.N., Ciravegna, F., Reinhard, D., Bernstein, A.: Evaluating semantic search systems to identify future directions of research. In: Proc. 2nd International Workshop on Evaluation of Semantic Technologies (IWEST 2012) (2012)
22. Hölscher, C., Strube, G.: Web search behavior of Internet experts and newbies. Comput. Netw. 33, 337–346 (2000)
23. Popescu, A.M., Etzioni, O., Kautz, H.: Towards a theory of natural language interfaces to databases. In: IUI 2003, pp. 149–157 (2003)
24. Damljanovic, D., Agatonovic, M., Cunningham, H.: FREyA: an Interactive Way of Querying Linked Data using Natural Language. In: Proc. QALD-1 Workshop
25. Lopez, V., Fernández, M., Motta, E., Stieler, N.: PowerAqua: supporting users in querying and exploring the semantic web. Semantic Web 3 (2012)
26. Albert, W., Tullis, T., Tedesco, D.: Beyond the Usability Lab: Conducting Large-Scale User Experience Studies. Elsevier Science (2010)
27. Bernstein, A., Reinhard, D., Wrigley, S.N., Ciravegna, F.: Evaluation design and collection of test data for semantic search tools. Technical Report D13.1, SEALS Consortium (2009)
28. Miller, R.: Human Ease of Use Criteria and Their Tradeoffs. IBM, Systems Development Division (1971)
29. Kelly, D.: Methods for Evaluating Interactive Information Retrieval Systems with Users. Found. Trends Inf. Retr. 3, 1–224 (2009)
30. Hix, D., Hartson, H.R.: Developing User Interfaces: Ensuring Usability Through Product and Process. J. Wiley (1993)
31. Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison Wesley Longman (1998)
32. Ericsson, K.A., Simon, H.A.: Protocol analysis: Verbal reports as data. MIT Press
33. Brooke, J.: SUS: a quick and dirty usability scale. In: Usability Evaluation in Industry, pp. 189–194. CRC Press (1996)
34. Bangor, A., Kortum, P.T., Miller, J.T.: An Empirical Evaluation of the System Usability Scale. Int'l J. Human-Computer Interaction 24, 574–594 (2008)
35. Bhagdev, R., Chapman, S., Ciravegna, F., Lanfranchi, V., Petrelli, D.: Hybrid Search: Effectively Combining Keywords and Semantic Searches. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (eds.) ESWC 2008. LNCS, vol. 5021, pp. 554–568. Springer, Heidelberg (2008)
36.
Strauss, A., Corbin, J.: Basics of qualitative research: grounded theory procedures and techniques. Sage Publications (1990) 37. Bangor, A., Kortum, P.T., Miller, J.T.: Determining what individual SUS scores mean: Adding an adjective rating scale. J. Usability Studies 4, 114–123 (2009) 38. Uren, V., Lei, Y., L´ opez, V., Liu, H., Motta, E., Giordanino, M.: The usability of semantic search tools: a review. Knowledge Engineering Review 22, 361–377 (2007) opez, V., Motta, E., Uren, V., Sabou, M.: Literature review and state of the art on Semantic Question Answering (2007) opez, V., Uren, V., Sabou, M., Motta, E.: Is question answering ﬁt for the Se- mantic Web? A survey. Semantic Web 2, 125–155 (2011) 41. Uren, V., Lei, Y., L´ opez, V., Liu, H., Motta, E., Giordanino, M.: The usability of semantic search tools: a review. The Knowledge Eng. Rev. 22, 361–377 (2007) 42. Meij, E., Mika, P., Zaragoza, H.: Investigating the Demand Side of Semantic Search through Query Log Analysis. In: Proc. SemSearch 2009 (2009) 43. Meij, E., Bron, M., Hollink, L., Huurnink, B., de Rijke, M.: Mapping queries to the Linking Open Data cloud: A case study using DBpedia. J. Web Sem. 9, 418–433(2011) 44. Elbedweihy, K., Wrigley, S.N., Ciravegna, F.: Improving Semantic Search Using Query Log Analysis. In: Proc. Interacting with Linked Data (ILD) 2012 Workshop(2012)
Results From the AMBITION Study of First-Line Treatment With Letairis and Tadalafil in Pulmonary Arterial Hypertension Published in The New England Journal of Medicine

August 26, 2015 5:01 PM ET

FOSTER CITY, Calif.--(BUSINESS WIRE)--Aug. 26, 2015-- Gilead Sciences, Inc. (Nasdaq: GILD) today announced detailed results from the AMBITION study (a randomized, double-blind, multicenter study of first-line combination therapy with AMBrIsentan and Tadalafil in patients with pulmonary arterial hypertensION). In AMBITION, conducted