<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>it’s all semantics &#187; classification/tagging</title>
	<atom:link href="http://semedica.wordpress.com/category/classificationtagging/feed/" rel="self" type="application/rss+xml" />
	<link>http://semedica.wordpress.com</link>
	<description>Semantic Strategy Insights for Publishers</description>
	<lastBuildDate>Mon, 17 May 2010 14:51:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='semedica.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/c65cb61b7068f8507109857024ad8976?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>it’s all semantics &#187; classification/tagging</title>
		<link>http://semedica.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://semedica.wordpress.com/osd.xml" title="it’s all semantics" />
	<atom:link rel='hub' href='http://semedica.wordpress.com/?pushpress=hub'/>
		<item>
		<title>The Real World &gt; Silverchair</title>
		<link>http://semedica.wordpress.com/2010/04/02/the-real-world-silverchair/</link>
		<comments>http://semedica.wordpress.com/2010/04/02/the-real-world-silverchair/#comments</comments>
		<pubDate>Fri, 02 Apr 2010 21:38:37 +0000</pubDate>
		<dc:creator>Elizabeth Willingham</dc:creator>
				<category><![CDATA[classification/tagging]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[thesaurus]]></category>
		<category><![CDATA[medical terminology]]></category>
		<category><![CDATA[semantic tagging]]></category>

		<guid isPermaLink="false">http://blog.silverchair.com/?p=405</guid>
		<description><![CDATA[The thesaurus supporting our Cortex medical taxonomy is distinguished from other standards by its inclusion of “real-world” equivalents. We generally call these “equivalents” rather than synonyms because we include things that arguably aren’t purely synonyms: jargon or shorthand versions of medical terminology that we run across in the medical literature we tag. More often, though, we [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-407" title="The Real World logo" src="http://semedica.files.wordpress.com/2010/04/the_real_world_logo_svg.png?w=250&#038;h=187" alt="The Real World logo" width="250" height="187" />The thesaurus supporting our Cortex medical taxonomy is distinguished from other standards by its inclusion of “real-world” equivalents. We generally call these “equivalents” rather than synonyms because we include things that arguably aren’t purely synonyms—jargon or shorthand versions of medical terminology that we run across in the medical literature we tag. More often, though, we learn about these equivalents (and common misspellings, which we also add to our thesaurus) by reviewing search queries submitted by real users to the sites we’ve built. Some examples: “C diff” for “<em>Clostridium difficile</em>,” “FB in foot” for “foreign body in foot,” “P4P” for “pay for performance,” “echo” for “echocardiography.” </p>
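<p>To make the idea concrete, here is a minimal Python sketch of an equivalents lookup. The four entries are the examples quoted above; the table structure and the <code>normalize_query</code> name are assumptions made for illustration, not a description of how the Cortex thesaurus is actually stored or queried.</p>

```python
# Hypothetical equivalents table. The entries come from the examples in the
# post; the dictionary layout and function name are illustrative assumptions.
EQUIVALENTS = {
    "c diff": "Clostridium difficile",
    "fb in foot": "foreign body in foot",
    "p4p": "pay for performance",
    "echo": "echocardiography",
}


def normalize_query(query):
    """Return the preferred term for a query, or the query itself if unknown."""
    key = " ".join(query.lower().split())  # case-fold and collapse whitespace
    return EQUIVALENTS.get(key, query)


print(normalize_query("C diff"))
print(normalize_query("  ECHO "))
print(normalize_query("asthma"))
```

<p>In this sketch an unrecognized query passes through unchanged, so the searcher is never penalized for using a term the table does not know yet.</p>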
<p>Unlike some taxonomies that have a more “academic” (read: stodgy) approach to what is considered a synonym, we put real-world equivalents in our thesaurus because we want it to work for <em>real-world users.</em> Many users of our health care information sites are pressed for time and are looking for an answer to a specific question. They shouldn’t have to think very hard about how to structure a query so that a search engine can understand it. It’s <em>our</em> job to be knowledgeable about both their language <em>and</em> their lingo. At Silverchair, we believe the searcher is never wrong (our version of “the customer is always right”).</p>
<p>Bob Wachter, with whom we’re privileged to work on two sites sponsored by the Agency for Healthcare Research and Quality (<a href="http://www.psnet.ahrq.gov/" target="_blank">PSNet</a> and <a href="http://webmm.ahrq.gov/" target="_blank">WebM&amp;M</a>), recently wrote a <a href="http://community.the-hospitalist.org/blogs/wachters_world/archive/2010/03/04/verb-alizing.aspx" target="_blank">humorous blog post</a> about the way his hospital colleagues at UCSF (and other hospitals) commonly turn the nouns of their everyday work life into verbs as a shorthand way of communicating. For example, a resident might report that she “heparinized” her patient, or that a patient ready for discharge has been “housed and spoused,” meaning that the patient has somewhere to go and someone to care for him. He also reports new terms derived from healthcare IT functionality, such as the way buttons are named in an EHR (“I done-ed it”).</p>
<p>That <a href="http://community.the-hospitalist.org/blogs/wachters_world/archive/2010/03/04/verb-alizing.aspx" target="_blank">post</a> is a fun reminder of the many ways medical lingo—and language—evolve, and the importance of attentive, systematic approaches to managing and supporting the information needs of those who invent the common parlance of their specialty in the course of doing their work (we hope, while using the sites we develop for them).</p>
]]></content:encoded>
			<wfw:commentRss>http://semedica.wordpress.com/2010/04/02/the-real-world-silverchair/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/e71ef4b47e6ba5c898bdeabbb8e47d6e?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Elizabeth Willingham</media:title>
		</media:content>

		<media:content url="http://semedica.files.wordpress.com/2010/04/the_real_world_logo_svg.png" medium="image">
			<media:title type="html">The Real World logo</media:title>
		</media:content>
	</item>
		<item>
		<title>Evaluation of Automated Tagging Solutions</title>
		<link>http://semedica.wordpress.com/2010/02/04/evaluation-of-automated-tagging-solutions/</link>
		<comments>http://semedica.wordpress.com/2010/02/04/evaluation-of-automated-tagging-solutions/#comments</comments>
		<pubDate>Thu, 04 Feb 2010 14:43:29 +0000</pubDate>
		<dc:creator>Jake Zarnegar</dc:creator>
				<category><![CDATA[classification/tagging]]></category>
		<category><![CDATA[semantic enrichment]]></category>
		<category><![CDATA[semantic tagging]]></category>
		<category><![CDATA[automated tagging]]></category>
		<category><![CDATA[Tagmaster]]></category>
		<category><![CDATA[Cortex]]></category>

		<guid isPermaLink="false">http://blog.silverchair.com/?p=360</guid>
		<description><![CDATA[As we at Silverchair and Semedica see more and more interest in automated tagging solutions (such as our Tagmaster system), we are more frequently encountering questions about how to evaluate their results. Here are a few ideas on the subject: Evaluation: Humans Required! It is hard to get around the fact that you will need human editors (or professional [...]]]></description>
			<content:encoded><![CDATA[
<p>As we at Silverchair and Semedica see more and more interest in automated tagging solutions (such as our <a href="http://www.semedica.com/tagmaster.aspx" target="_blank">Tagmaster</a> system), we are more frequently encountering questions about how to evaluate their results. Here are a few ideas on the subject:</p>
<h1>Evaluation: Humans Required!</h1>
<p>It is hard to get around the fact that you will need human editors (or professional indexers) and your human technology team (who will use the tags to create interesting new features) to verify that an automated system is working correctly and that the tagging is accurate and useful. </p>
<p>Recently, someone asked our CEO Thane Kerner if we had an automated system to verify the accuracy of our automated tagging. Thane replied (rather cheekily, I must say): “If we had an automated review system that could measure tagging accuracy more precisely than the current tagging system, we wouldn’t use it to verify tags, we’d use it to tag the content to begin with!” The lesson: Once you’ve deployed your best automated system to do the tagging, humans are the next logical reviewers. </p>
<p>Here are four factors your humans should consider in their review:</p>
<div id="attachment_372" class="wp-caption alignright" style="width: 310px"><a href="http://semedica.files.wordpress.com/2010/02/tagmaster_content_page1.gif"><img class="size-medium wp-image-372" title="View inside Semedica's Tagmaster" src="http://semedica.files.wordpress.com/2010/02/tagmaster_content_page1.gif?w=300&#038;h=192" alt="View inside Semedica's Tagmaster" width="300" height="192" /></a><p class="wp-caption-text">View inside Semedica&#39;s Tagmaster, showing tags automatically inserted at the paragraph level</p></div>
<h2>1.  Expert/Editorial Accuracy Confidence</h2>
<p>One key target for evaluation is to assess how much confidence your key stakeholders (journal boards, editors, etc.) express in the output of the system. But confidence is not a linear equation. I posit the following values:</p>
<ul>
<li>Impeccable tag placement: +1</li>
<li>Debatable tag placement: −1</li>
<li>Debatable tag omission: −1</li>
<li>Obvious tag omission: −10</li>
<li>Obvious irrelevant tag placement: −50</li>
</ul>
<p>The first thing you’ll notice is how lopsided the weights are: a single obvious irrelevant tag cancels out fifty impeccable ones. In high-stakes fields (including science and medicine), humans naturally weigh negative experiences more heavily than positive ones. (Of course, this has served us well in survival: “Don’t eat that type of berry again, it made you sick last time!”) What that means in terms of confidence is that stakeholders will need a <em>disproportionate amount</em> of positive reassurance to get over negative outcomes. And the impact of a particularly egregious negative outcome (resulting from a particularly poorly placed tag) can be devastating to your stakeholders’ impression of a tagging system. (This is why Silverchair’s system defaults to using conservative methods with very little “guessing” to avoid obvious irrelevant tag placement.)</p>
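<p>For readers who want to see the arithmetic, here is a minimal Python sketch that applies the weights listed above to a review sample. The weights are the ones I posited; the outcome names, the <code>confidence_score</code> function, and the review counts are all made up for illustration.</p>

```python
# Weights taken from the list in the post; the key names are assumptions.
WEIGHTS = {
    "impeccable_placement": 1,
    "debatable_placement": -1,
    "debatable_omission": -1,
    "obvious_omission": -10,
    "obvious_irrelevant_placement": -50,
}


def confidence_score(counts):
    """Sum the weighted counts of each review outcome."""
    return sum(WEIGHTS[outcome] * n for outcome, n in counts.items())


# Hypothetical review of 100 tagging decisions (invented numbers).
review = {
    "impeccable_placement": 90,
    "debatable_placement": 5,
    "debatable_omission": 3,
    "obvious_omission": 1,
    "obvious_irrelevant_placement": 1,
}

print(confidence_score(review))  # 90 - 5 - 3 - 10 - 50 = 22
```

<p>Even with 90 of 100 decisions judged impeccable, the two obvious errors in this invented sample pull the score down from 90 to 22.</p>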
<h2>2.  Usefulness!</h2>
<p>The next key target for evaluation, for both editorial and technical stakeholders, is the <em>usefulness</em> of the tagging applied. Tags should be highly relevant in a domain-specific context, and they should drive better discoverability and linking. Primary care, genetics, surgery, and emergency care all take very different approaches to the same topics, and their tagging should reflect their uses.</p>
<p>The tagging system you are evaluating may have added tagged concepts that are tangential or irrelevant to the use model of the content, and such tags would not be capable of driving innovative site features (in many cases, tangential tagging actually <em>inhibits</em> the ability of new systems to work effectively). For example, it is a nice-to-have if your tagging system can recognize place names and person names, but if it misses or miscategorizes important topics like clinical trial names, it doesn’t matter how many people or places it can tag. (Clinical trial acronyms can be particularly tricky to tag; <a href="http://blog.silverchair.com/2010/01/26/searches-for-clinical-trials-we-can-do-better/" target="_blank">see our post</a> about them.)</p>
<h2>3.  Granularity</h2>
<p>Does the system still work with “documents” or can it identify topics down to the section/paragraph/figure/table/equation level? At Silverchair we work with many dense medical chapters that may cover more than 200 distinct topics, so we see it as a necessity for our tagging system to break those documents down into smaller parts in order to deliver precise packets of highly relevant information to our users.</p>
<h2>4.  Control and Ongoing Improvement</h2>
<p>No system you select is going to be extremely accurate “out of the box.” (I write that as a realist, not as a pessimist!) So during evaluation you must ask, “How easy is it to make impactful positive changes to the system?” The methods vary: some systems suggest manually selecting training documents for each topic or category (which can get onerous when you have 20,000 topics), some systems allow your software developers to go in and tinker with the code (you have software developers who are data classification experts, right?!?), and some systems allow you to load and use a taxonomy or thesaurus to aid in topic identification and tagging (which assumes a taxonomy or thesaurus exists or can be created for your domain).</p>
<p>At Silverchair, we work primarily in medicine, which is a taxonomy-rich domain with an ever-growing list of topics. For that reason, we’ve chosen the last method as our control and improvement strategy. Our editors update our <a href="http://semedica.com/cortex.aspx" target="_blank">Cortex</a> medical taxonomy and its related thesaurus every day to keep pace with the topics being written about and searched for. </p>
<h1>Summary</h1>
<p>If you choose a system that 1) is accurate enough to instill confidence in your editorial team, 2) is useful enough to drive meaningful new features and improvements, 3) classifies your data at a granular level, and 4) is flexible enough to allow explicit control and ongoing improvements, you’ve made a wise purchase!</p>
]]></content:encoded>
			<wfw:commentRss>http://semedica.wordpress.com/2010/02/04/evaluation-of-automated-tagging-solutions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f98c3087939c2c744ccaa4a42b38d3e9?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Jake Zarnegar</media:title>
		</media:content>

		<media:content url="http://semedica.files.wordpress.com/2010/02/tagmaster_content_page1.gif?w=300" medium="image">
			<media:title type="html">View inside Semedica's Tagmaster</media:title>
		</media:content>
	</item>
		<item>
		<title>Searches for Clinical Trials: We Can Do Better!</title>
		<link>http://semedica.wordpress.com/2010/01/26/searches-for-clinical-trials-we-can-do-better/</link>
		<comments>http://semedica.wordpress.com/2010/01/26/searches-for-clinical-trials-we-can-do-better/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 04:55:03 +0000</pubDate>
		<dc:creator>Elizabeth Willingham</dc:creator>
				<category><![CDATA[classification/tagging]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantic enrichment]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[medical terminology]]></category>
		<category><![CDATA[semantic tagging]]></category>
		<category><![CDATA[Clinical trial]]></category>

		<guid isPermaLink="false">http://blog.silverchair.com/?p=344</guid>
		<description><![CDATA[Clinical trials are popular targets of searches in medical journals. To deliver accurate search and browse results for them, semantic tagging and a semantic search engine are essential. The names of clinical trials are often long and unwieldy, as they try to describe the focus and mission of the trial in their name: for example, a [...]]]></description>
			<content:encoded><![CDATA[<p>Clinical trials are popular targets of searches in medical journals. To deliver accurate search and browse results for them, semantic tagging and a semantic search engine are essential.</p>
<div class="zemanta-img zemanta-action-dragged" style="display:block;margin:1em;">
<div class="wp-caption alignright" style="width: 250px"><a href="http://commons.wikipedia.org/wiki/Image:Map_of_Florida_highlighting_Jupiter.svg"><img class=" " title="Location of Jupiter in Palm Beach County, Florida" src="http://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Map_of_Florida_highlighting_Jupiter.svg/300px-Map_of_Florida_highlighting_Jupiter.svg.png" alt="Location of Jupiter in Palm Beach County, Florida" width="240" height="190" /></a><p class="wp-caption-text">Location of Jupiter in Palm Beach County, Florida (image via Wikipedia)</p></div>
</div>
<p>The names of clinical trials are often long and unwieldy, as they try to describe the focus and mission of the trial in their name. For example, a clinical trial studying drug treatment of high cholesterol is “Arterial Biology for the Investigation of the Treatment Effects of Reducing Cholesterol 6–HDL and LDL Treatment Strategies.” Because of these long names, trials are more commonly known by their acronyms (in this case, the “ARBITER 6–HALTS” trial), and no doubt their full names are crafted to yield a catchy, apropos, or hopeful acronym. For example, the acronym for the trial studying the effect of the drug Vytorin on cholesterol levels is “IMPROVE-IT.” (See this <a href="http://" target="_blank">blogpost</a> for some humorous trial names and acronyms.)</p>
<p>One of my pet peeves is the incorrect use of the word “acronym” to mean any abbreviation for a term. Actually an abbreviation is also an acronym <em>only</em> when the abbreviation spells a word or is a combination of letters that people can pronounce as a word. So yes—abbreviations of clinical trials are acronyms, and ah, there’s the rub for commonly used full-text <em>nonsemantic</em> search engines. A full-text search engine treats them like any other word.</p>
<p>So yikes: a PubMed search for “JUPITER” (the acronym for the trial “Justification for the Use of Statins in Prevention: an Intervention Trial Evaluating Rosuvastatin”) delivers the first two results correctly, but the third result appears only because the institution that issued the paper is in Jupiter, Florida! OK, so yes, the PubMed search box tries to help you by suggesting “Jupiter trial” (98 results) … but it also suggests “Jupiter study” (257 results). People, the JUPITER trial and the JUPITER study are exactly the same thing to any searcher wanting to know about JUPITER. The number of results should be the same for both searches. And nobody searching PubMed for JUPITER wants to know more about Jupiter, Florida. Trust me.</p>
<p>We can do better. At Silverchair, our Cortex taxonomy contains a list of clinical trials and the accompanying thesaurus includes their acronyms, so when our tagging and retrieval systems encounter those concepts, we’re able to separate them from their normal English language counterparts and tag them correctly.  Yet another benefit of an automated tagging system supported by a robust and up-to-date medical thesaurus. It understands medical information and the health care professionals who depend on it so that we can give them <em>results</em>, not guesses.</p>
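<p>Here is a minimal Python sketch of that idea, under heavy assumptions: the concept identifier, the thesaurus entries, the three documents, and the <code>search</code> function are all invented for illustration and do not describe our production systems. The point is only that when every surface form resolves to one concept, and documents are retrieved by concept tag instead of by raw word, “JUPITER trial” and “JUPITER study” return the same results.</p>

```python
# Hypothetical concept ID, thesaurus, and documents, invented for illustration.
TRIAL_CONCEPT = "trial:JUPITER"

# Thesaurus entries: every surface form points at the same concept.
THESAURUS = {
    "jupiter trial": TRIAL_CONCEPT,
    "jupiter study": TRIAL_CONCEPT,
}

# Documents carry concept tags assigned at indexing time, not raw words.
DOCUMENTS = [
    {"id": 1, "tags": {TRIAL_CONCEPT}},  # paper about the statin trial
    {"id": 2, "tags": {TRIAL_CONCEPT}},  # follow-up analysis of the trial
    {"id": 3, "tags": set()},            # paper from an institution in Jupiter, Florida
]


def search(query):
    """Return ids of documents tagged with the concept the query resolves to."""
    concept = THESAURUS.get(" ".join(query.lower().split()))
    if concept is None:
        return []
    return [doc["id"] for doc in DOCUMENTS if concept in doc["tags"]]


print(search("JUPITER trial"))
print(search("Jupiter study"))
```

<p>Both calls return documents 1 and 2. Document 3 never appears, because it was not tagged with the trial concept even though the word “Jupiter” occurs in its address.</p>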
]]></content:encoded>
			<wfw:commentRss>http://semedica.wordpress.com/2010/01/26/searches-for-clinical-trials-we-can-do-better/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/e71ef4b47e6ba5c898bdeabbb8e47d6e?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Elizabeth Willingham</media:title>
		</media:content>

		<media:content url="http://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Map_of_Florida_highlighting_Jupiter.svg/300px-Map_of_Florida_highlighting_Jupiter.svg.png" medium="image">
			<media:title type="html">Location of Jupiter in Palm Beach County, Florida</media:title>
		</media:content>

	</item>
		<item>
		<title>Internal Memory vs. External Memory</title>
		<link>http://semedica.wordpress.com/2009/12/07/internal-memory-vs-external-memory/</link>
		<comments>http://semedica.wordpress.com/2009/12/07/internal-memory-vs-external-memory/#comments</comments>
		<pubDate>Mon, 07 Dec 2009 19:39:09 +0000</pubDate>
		<dc:creator>Jake Zarnegar</dc:creator>
				<category><![CDATA[classification/tagging]]></category>
		<category><![CDATA[semantic enrichment]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[computer memory]]></category>

		<guid isPermaLink="false">http://blog.silverchair.com/?p=330</guid>
		<description><![CDATA[As we were setting up a new external SAN (storage area network) on the Silverchair production web farm recently, the network engineer said something that caught my attention: “The web servers will be able to use the external SAN drives faster than their own internal memory.” At first that defied my expectations of “internal vs. [...]]]></description>
			<content:encoded><![CDATA[<p>As we were setting up a new external SAN (storage area network) on the Silverchair production web farm recently, the network engineer said something that caught my attention: “The web servers will be able to use the external SAN drives <strong><em>faster</em></strong> than their own internal memory.” At first that defied my expectations of “internal vs. external,” but when I thought about it more, it made perfect sense.</p>
<p>The web servers are designed to execute application logic, store session tracking data, handle user interaction input, and synthesize, parse, and display data from a variety of sources—they are logic processing engines that handle data storage only when necessary. On the other hand, the SAN has one purpose—to store a large amount of data and enable a super-efficient data delivery channel that rapidly responds to content requests from the web servers.</p>
<p>The more I thought about it, the more I realized it was a fitting metaphor for how humans work. We are fantastic logic processing engines. We parse, synthesize, analyze, and use data input from a variety of sources to perform creative problem solving. And most importantly to this metaphor, we only store data internally when absolutely necessary. In the present day, the comprehensiveness and ubiquity of the Internet have allowed us to store an unprecedented amount of collective memory in external sources and access it from wherever we may be.</p>
<p>To be clear, human use of external memory did not arrive with the Internet; it has been around since the beginning of civilization. We are used to storing memory in external sources and freeing up our internal resources. Papyrus eliminated the need to memorize long epic poems. Abaci eliminated the need to memorize multiplication tables. (<em>NB</em>: Don’t try telling that to a 2nd grade teacher.) In modern medicine, drug handbooks store dosage and safety information that is too complex for doctors to memorize <em>in toto</em>. Phone numbers stored in our mobile phones eliminate the need to memorize the phone numbers of friends. We even store memories in our friends and family: I recently asked my wife, “What was the name of that hotel we liked in Chicago?” She knew, and voilà, I had accessed my external memory successfully.</p>
<p>Alas, my comparison of human activity to Silverchair’s web farm breaks down at a key point. In many cases, accessing our external memory is <em>not</em> fast and efficient. Currently the external memory sources of humans are not deployed as efficiently as a SAN. Internet content sources can be hard to access, store content in highly variable forms, require a special vocabulary or technique to query, and return data in a way that does not suit our purpose.</p>
<p>This is the fundamental problem that Silverchair’s Semedica division addresses with semantic enrichment of data sources. We’re organizing a specific external memory category (in our case, online medical and health care information) in a way that allows it to be accessed more quickly and to return data in the right form for efficient use by clinicians and researchers. The less data that health care workers need to store internally, the more of their “processing time” can be used toward envisioning creative solutions for preventing and curing diseases. That is something that the Internet cannot do. (Yet.)</p>
<div class="zemanta-pixie" style="margin-top:10px;height:15px;"><a class="zemanta-pixie-a" title="Reblog this post [with Zemanta]" href="http://reblog.zemanta.com/zemified/f9513dac-6816-4e9e-8fdf-f32ea02d43aa/"><img class="zemanta-pixie-img" style="border:medium none;float:right;" src="http://img.zemanta.com/reblog_e.png?x-id=f9513dac-6816-4e9e-8fdf-f32ea02d43aa" alt="Reblog this post [with Zemanta]" /></a></div>
<br />Posted in classification/tagging, semantic enrichment Tagged: classification/tagging, computer memory, memory]]></content:encoded>
			<wfw:commentRss>http://semedica.wordpress.com/2009/12/07/internal-memory-vs-external-memory/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f98c3087939c2c744ccaa4a42b38d3e9?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Jake Zarnegar</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/reblog_e.png?x-id=f9513dac-6816-4e9e-8fdf-f32ea02d43aa" medium="image">
			<media:title type="html">Reblog this post [with Zemanta]</media:title>
		</media:content>
	</item>
		<item>
		<title>NIH Makes Big Strides Toward Funding Clarity, But Still Could Be Better!</title>
		<link>http://semedica.wordpress.com/2009/11/06/nih-makes-big-strides-toward-funding-clarity-but-still-could-be-better/</link>
		<comments>http://semedica.wordpress.com/2009/11/06/nih-makes-big-strides-toward-funding-clarity-but-still-could-be-better/#comments</comments>
		<pubDate>Fri, 06 Nov 2009 15:33:18 +0000</pubDate>
		<dc:creator>Jake Zarnegar</dc:creator>
				<category><![CDATA[classification/tagging]]></category>
		<category><![CDATA[semantic enrichment]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[RePORT (Research Portfolio Online Reporting Tool)]]></category>
		<category><![CDATA[Grant funding]]></category>
		<category><![CDATA[National Institutes of Health (NIH)]]></category>
		<category><![CDATA[Agency for Healthcare Research and Quality (AHRQ)]]></category>

		<guid isPermaLink="false">http://blog.silverchair.com/?p=253</guid>
		<description><![CDATA[The NIH has rolled out its new RePORT (Research Portfolio Online Reporting Tool) web site for information on funding, grants, and NIH research. As someone who works on government grants and contracts, I’m happy with this new level of transparency and clarity about which topics (and who!) are being funded. It is a big [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_257" class="wp-caption alignright" style="width: 190px"><a href="http://en.wikipedia.org/wiki/Apples_and_oranges"><img class="size-full wp-image-257" title="Apples_to_Oranges" src="http://semedica.files.wordpress.com/2009/11/apples_to_oranges.jpg?w=180&#038;h=124" alt="Apples to oranges comparison" width="180" height="124" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
<p>The NIH has rolled out its new <a href="http://report.nih.gov/" target="_blank">RePORT (Research Portfolio Online Reporting Tool) web site</a> for information on funding, grants, and NIH research. As someone who works on government grants and contracts, I’m happy with this new level of transparency and clarity about which topics (and who!) are being funded. It is a big upgrade from the previous system, which was hard to navigate and understand.</p>
<p>The most useful area of the site to me is the <a href="http://report.nih.gov/rcdc/categories/" target="_blank">categorical spending section</a>. With over 200 funding categories, it gives a real sense of NIH’s funding priorities.</p>
<p>However, the section still has ample room for improvement. Currently it is an alphabetical list whose items are hard to compare. Here are some example categories that are not equivalent in scope:</p>
<ul>
<li>Allergic Rhinitis (Hay Fever)</li>
<li>American Indians / Alaska Natives</li>
<li>Burden of Illness</li>
<li>Cancer</li>
<li>Cardiovascular</li>
<li>Clinical Trials</li>
<li>Conditions Affecting Unborn Children</li>
<li>Gene Therapy</li>
<li>Gene Therapy Clinical Trials</li>
<li>Genetic Testing</li>
<li>Genetics</li>
</ul>
<p>Some are very specific (hay fever), some are broad (cancer), some are ambiguous (cardiovascular), some take a completely different approach from the dominant disease/condition approach (American Indians/Alaska Natives), and some seem to overlap (Gene Therapy, Gene Therapy Clinical Trials, and Clinical Trials).</p>
<p>With a bit of work, this information could be turned from its current flat list into a multilevel taxonomy that allows users to slice it up in the ways that appeal to them (by condition or by target population, for example). Silverchair does this for the Agency for Healthcare Research and Quality on its <a href="http://psnet.ahrq.gov/" target="_blank">PSNet</a> patient safety clearinghouse. A small amount of classification work can go a long way in creating valuable new features. NIH has proven that with its RePORT upgrade, but I’d like to see the agency go further.</p>
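<p>To make the idea concrete, here is a minimal Python sketch of how the sample categories above might be regrouped. The facet names ("Condition", "Population", "Research Area", "Research Method") and the parent links are my own illustrative assumptions, not an NIH or Silverchair classification, and the function names are hypothetical.</p>

```python
from collections import defaultdict

# (category, facet, parent) rows for the sample categories listed above.
# Facet names and parent links are illustrative assumptions only.
ROWS = [
    ("Allergic Rhinitis (Hay Fever)", "Condition", None),
    ("Cancer", "Condition", None),
    ("Cardiovascular", "Condition", None),
    ("Conditions Affecting Unborn Children", "Condition", None),
    ("American Indians / Alaska Natives", "Population", None),
    ("Burden of Illness", "Research Area", None),
    ("Genetics", "Research Area", None),
    ("Genetic Testing", "Research Area", "Genetics"),
    ("Gene Therapy", "Research Area", "Genetics"),
    ("Clinical Trials", "Research Method", None),
    ("Gene Therapy Clinical Trials", "Research Method", "Clinical Trials"),
]


def build_taxonomy(rows):
    """Group flat rows into a facet / top-level category / children mapping."""
    taxonomy = defaultdict(dict)
    # First pass: register every top-level category under its facet.
    for name, facet, parent in rows:
        if parent is None:
            taxonomy[facet].setdefault(name, [])
    # Second pass: attach each child category to its parent.
    for name, facet, parent in rows:
        if parent is not None:
            taxonomy[facet].setdefault(parent, []).append(name)
    return dict(taxonomy)


def slice_by_facet(taxonomy, facet):
    """Return the top-level categories in one facet, sorted by name."""
    return sorted(taxonomy.get(facet, {}))


taxonomy = build_taxonomy(ROWS)
print(slice_by_facet(taxonomy, "Condition"))
print(slice_by_facet(taxonomy, "Population"))
print(taxonomy["Research Area"]["Genetics"])
```

<p>The point of the sketch is that each category is classified once, and every "slice" (conditions only, populations only, everything under Genetics) then falls out of the same data.</p>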
<p>I’d be happy to help out with the NIH site, but I’m not sure what category that would be funded under…</p>
<br />Posted in classification/tagging, semantic enrichment, taxonomy Tagged: Agency for Healthcare Research and Quality (AHRQ), classification/tagging, Grant funding, National Institutes of Health (NIH), RePORT (Research Portfolio Online Reporting Tool), taxonomy]]></content:encoded>
			<wfw:commentRss>http://semedica.wordpress.com/2009/11/06/nih-makes-big-strides-toward-funding-clarity-but-still-could-be-better/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f98c3087939c2c744ccaa4a42b38d3e9?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Jake Zarnegar</media:title>
		</media:content>

		<media:content url="http://semedica.files.wordpress.com/2009/11/apples_to_oranges.jpg" medium="image">
			<media:title type="html">Apples_to_Oranges</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/reblog_e.png?x-id=11b89d62-1cff-4672-917b-e96703a67171" medium="image">
			<media:title type="html">Reblog this post [with Zemanta]</media:title>
		</media:content>
	</item>
	</channel>
</rss>