Information Processing

Michael Riben
Note by Michael Riben, updated more than 1 year ago


Board Exam Medical Informatics -General Note on Information Processing, created by Michael Riben on 09/24/2013.


Qualitative data - describes nominal, categorical aspects; often presented in tabular form, usually as absolute or relative frequencies
Ordinal data - represents observable quantifications (e.g. disease stages, quality of life, social status) --> ordered categories
Quantitative data - represents countable or measurable observations
Discrete data - corresponds to countable observations
Continuous data - results of measured observations (e.g. blood pressures, serum glucose concentrations)

Continuous Data -
1) Usually reported in classes; representations include histograms. Data often have associated parameters such as the mean value, empirical variance, or standard deviation
2) Mean - average value
3) Standard deviation - spread of the distribution
4) Range - spread of the data: distance between the largest and smallest observations
5) Normal distribution = Gaussian curve - 95% of all observations fall between mean +/- 2s (more precisely 1.96s)
6) Log-normal distribution - medical data are often of this type, requiring a logarithmic transformation
7) Skewed data distribution - data has outliers on one side, making the distribution asymmetric
8) Median = right in the middle of all observations of sorted data = 50% quantile = x0.5
9) 1 - β = power of a statistical test. Don't perform test for power
10) Type 1 error = one falsely rejects the null hypothesis although it is true
11) Type 2 error = the null hypothesis is maintained although the alternative hypothesis is true
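
As a rough illustration of the parameters listed above, the following sketch computes them with Python's standard `statistics` module. The sample values and the helper name `describe` are invented for the example, and the mean +/- 1.96s interval is only meaningful when the data are approximately normal.

```python
import math
import statistics

# Hypothetical sample of a continuous measurement (values invented for illustration).
values = [4.0, 5.0, 5.0, 6.0, 7.0, 9.0, 13.0]


def describe(xs):
    """Return the summary parameters named in the notes for a list of numbers."""
    mean = statistics.mean(xs)
    s = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": mean,
        "variance": statistics.variance(xs),  # empirical (sample) variance
        "sd": s,
        "range": max(xs) - min(xs),  # largest minus smallest observation
        "median": statistics.median(xs),  # 50% quantile
        # Covers about 95% of observations only if the data are roughly normal.
        "interval_95": (mean - 1.96 * s, mean + 1.96 * s),
    }


summary = describe(values)

# For data assumed to be log-normal, summarize the log-transformed values instead.
log_summary = describe([math.log(x) for x in values])
```

The long right tail in the invented sample pulls the mean above the median, which is the usual sign of a right-skewed distribution.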

"Stochastic" means being or having a random variable. A stochastic model is a tool for estimating probability distributions of potential outcomes by allowing for random variation in one or more inputs over time. The random variation is usually based on fluctuations observed in historical data for a selected period using standard time-series techniques. Distributions of potential outcomes are derived from a large number of simulations (stochastic projections) which reflect the random variation in the input(s)

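The idea of stochastic projections can be sketched with a small Monte Carlo simulation. Everything specific here is an assumption made for illustration: the function names, the normally distributed per-period change, and the parameter values (which in practice would be estimated from historical data).

```python
import random
import statistics


def simulate_outcome(rng, periods, start, mean_change, sd_change):
    """One stochastic projection: add a random change to the input each period."""
    value = start
    for _ in range(periods):
        value += rng.gauss(mean_change, sd_change)
    return value


def stochastic_projection(n_runs=10_000, seed=0):
    """Estimate the distribution of outcomes from many simulated projections."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    outcomes = sorted(
        simulate_outcome(rng, periods=12, start=100.0, mean_change=1.0, sd_change=2.0)
        for _ in range(n_runs)
    )
    return {
        "mean": statistics.mean(outcomes),
        "p05": outcomes[int(0.05 * n_runs)],  # approximate 5th percentile
        "p95": outcomes[int(0.95 * n_runs)],  # approximate 95th percentile
    }
```

The result is a summary of a distribution (mean and percentiles) rather than a single predicted value, which is the point of a stochastic model.
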
Venn Diagrams - Graphical representation of all objects belonging to a class of objects. The square = domain of a certain class of objects. Logical operators:
- Apostrophe (A') = NOT
- Period (A.B) = AND = logical product
- Plus sign (A+B) = OR = logical summation = the union of A and B in a Venn diagram
- E = A --> B = A implies B, or "if A then B"
This Boolean logic is used frequently in knowledge bases and expert systems.

De Morgan's Laws: The rules can be expressed in English as:
- The negation of a conjunction is the disjunction of the negations.
- The negation of a disjunction is the conjunction of the negations.

or informally as:
- "not (A and B)" is the same as "(not A) or (not B)"
- "not (A or B)" is the same as "(not A) and (not B)"

The rules can be expressed in formal language with two propositions P and Q as:
¬(P ∧ Q) ⇔ (¬P) ∨ (¬Q)
¬(P ∨ Q) ⇔ (¬P) ∧ (¬Q)

where:
- ¬ is the negation operator (NOT)
- ∧ is the conjunction operator (AND)
- ∨ is the disjunction operator (OR)
- ⇔ is a metalogical symbol meaning "can be replaced in a logical proof with"

Applications of the rules include simplification of logical expressions in computer programs and digital circuit designs. De Morgan's laws are an example of a more general concept of mathematical duality.
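
Because there are only four combinations of truth values for two propositions, both laws can be checked exhaustively. A minimal sketch (the function name `de_morgan_holds` is made up for this example):

```python
from itertools import product


def de_morgan_holds():
    """Check both De Morgan laws for every truth assignment of P and Q."""
    for p, q in product([False, True], repeat=2):
        # not (P and Q)  is the same as  (not P) or (not Q)
        if (not (p and q)) != ((not p) or (not q)):
            return False
        # not (P or Q)  is the same as  (not P) and (not Q)
        if (not (p or q)) != ((not p) and (not q)):
            return False
    return True
```

In program code the same rewrite is what lets a condition such as `not (a and b)` be replaced by `(not a) or (not b)` without changing behavior.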

In logic, a tautology (from the Greek word ταυτολογία) is a formula which is true in every possible interpretation.

A formula of propositional logic is a tautology if the formula itself is always true regardless of which valuation is used for the propositional variables. There are infinitely many tautologies. Examples include:
- (A ∨ ¬A) ("A or not A"), the law of the excluded middle. This formula has only one propositional variable, A. Any valuation for this formula must, by definition, assign A one of the truth values true or false, and assign ¬A the other truth value.
- (A → B) ⇔ (¬B → ¬A) ("if A implies B then not-B implies not-A", and vice versa), which expresses the law of contraposition.
- ((¬A → B) ∧ (¬A → ¬B)) → A ("if not-A implies both B and its negation not-B, then not-A must be false, then A must be true"), which is the principle known as reductio ad absurdum.
- ¬(A ∧ B) ⇔ (¬A ∨ ¬B) ("if not both A and B, then either not-A or not-B", and vice versa), which is known as de Morgan's law.
- ((A → B) ∧ (B → C)) → (A → C) ("if A implies B and B implies C, then A implies C"), which is the principle known as syllogism.
- ((A ∨ B) ∧ (A → C) ∧ (B → C)) → C (if at least one of A or B is true, and each implies C, then C must be true as well), which is the principle known as proof by cases.
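
A formula with n propositional variables has only 2^n valuations, so whether it is a tautology can be decided by brute force. The sketch below checks the named principles (excluded middle, contraposition, reductio ad absurdum, syllogism, proof by cases); the helper names `implies` and `is_tautology` are illustrative, not from the source.

```python
from itertools import product


def implies(a, b):
    """Material implication: 'a implies b' is false only when a is true and b is false."""
    return (not a) or b


def is_tautology(formula, n_vars):
    """True if the formula evaluates to True under every valuation of its variables."""
    return all(formula(*vals) for vals in product([False, True], repeat=n_vars))


def excluded_middle(a):
    return a or not a


def contraposition(a, b):
    return implies(a, b) == implies(not b, not a)


def reductio(a, b):
    return implies(implies(not a, b) and implies(not a, not b), a)


def syllogism(a, b, c):
    return implies(implies(a, b) and implies(b, c), implies(a, c))


def proof_by_cases(a, b, c):
    return implies((a or b) and implies(a, c) and implies(b, c), c)


def plain_implication(a, b):
    # Not a tautology: false when a is true and b is false.
    return implies(a, b)
```

For example, `is_tautology(syllogism, 3)` evaluates the formula under all eight valuations of A, B and C.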

A Boolean algebra is a six-tuple consisting of a set A, equipped with two binary operations ∧ (called "meet" or "and"), ∨ (called "join" or "or"), a unary operation ¬ (called "complement" or "not") and two elements 0 and 1 (called "bottom" and "top", or "least" and "greatest" element, also denoted by the symbols ⊥ and ⊤, respectively), such that for all elements a, b and c of A, the following axioms hold:[1]
- associativity: a ∨ (b ∨ c) = (a ∨ b) ∨ c; a ∧ (b ∧ c) = (a ∧ b) ∧ c
- commutativity: a ∨ b = b ∨ a; a ∧ b = b ∧ a
- absorption: a ∨ (a ∧ b) = a; a ∧ (a ∨ b) = a
- identity: a ∨ 0 = a; a ∧ 1 = a
- distributivity: a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c); a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)
- complements: a ∨ ¬a = 1; a ∧ ¬a = 0
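
One concrete Boolean algebra is the set of all subsets of a fixed universe, with union as join, intersection as meet, set complement as ¬, the empty set as 0 and the universe as 1. The sketch below checks the axioms exhaustively for a three-element universe; the universe and the function names are arbitrary choices for illustration.

```python
from itertools import combinations, product

UNIVERSE = frozenset({1, 2, 3})  # arbitrary small universe for the check


def power_set(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def complement(a):
    return UNIVERSE - a


def axioms_hold():
    """Check every listed axiom for all triples of subsets of UNIVERSE."""
    elements = power_set(UNIVERSE)
    bottom, top = frozenset(), UNIVERSE
    for a, b, c in product(elements, repeat=3):
        checks = [
            a | (b | c) == (a | b) | c, a & (b & c) == (a & b) & c,  # associativity
            a | b == b | a, a & b == b & a,  # commutativity
            a | (a & b) == a, a & (a | b) == a,  # absorption
            a | bottom == a, a & top == a,  # identity
            a | (b & c) == (a | b) & (a | c),  # distributivity
            a & (b | c) == (a & b) | (a & c),
            a | complement(a) == top, a & complement(a) == bottom,  # complements
        ]
        if not all(checks):
            return False
    return True
```

This is also the algebra a Venn diagram draws: regions of the square are the subsets, and the operators combine regions.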

Decision Support System - Any piece of software that takes as input information about a clinical situation and produces as output inferences that can assist practitioners in their decision making and that would be judged as intelligent by the program's users. There are 4 types of systems:
1) Rule-based systems - production rules
2) Neural networks
3) Bayesian networks
4) Pattern recognition or program code

Production rules, coupled with inference engines that perform forward and backward chaining of the rules, derive conclusions for decision support. Example production rule systems: MYCIN was a rule-based system for causes of bacteremia and meningitis and appropriate therapy; EMYCIN was a next-generation system for additional domains; CLOT was a third-generation system for blood coagulation disorders; and PUFF was for pulmonary function tests. EXPERT and OPS5 also date from the 1970s.

Arden Syntax - a rule-based formalism used to encode medical logic modules (MLMs).

Experts often use "rules of thumb", or heuristics, when describing how they solved problems, and there was little reason to doubt that they used these during their cognition.

Heuristic classification involves 3 types of inferences:
1) Features of input cases are abstracted into higher-level generalizations (e.g. the WBC is used to abstract characterizations about a patient, such as immunocompromised)
2) Generalizations are associated heuristically with elements of the set of potential solutions that classify the case (e.g. an immunocompromised state is associated with gram-negative infections)
3) Additional information is used to refine abstract solutions into more specific ones (e.g. the situation suggests that the most likely cause of infection is E. coli or Pseudomonas)

QMR doesn't use heuristic classification; it uses a pattern-matching algorithm.

Forward chaining (data driven) and backward chaining (hypothesis driven) techniques represent the fundamental reasoning approaches implemented in rule-based expert systems. These will be introduced using a simulation of the expert system model shown above along with a simplified representation of the "auto diagnosis" advising problem that provided a framework for the introductory tutorial.

Forward chaining. This method begins with a set of known facts or attribute values and applies these values to rules that use them in their premise. Any rules that are proven true fire and produce additional facts that are again applied to relevant rules. The process continues until no new facts are produced or a value for the goal is obtained. This approach works well when it is natural to gather multiple facts before trying to draw any conclusions and when there are many possible conclusions to be drawn from the facts.
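
The forward-chaining loop described above can be sketched as follows. The three rules are a made-up fragment of the "auto diagnosis" problem, and the rule format (a premise dictionary plus one concluded attribute) is an assumption chosen to keep the example short; real expert system shells use richer rule languages.

```python
# Each rule: (premise, conclusion). The premise maps attributes to required values;
# the conclusion is one (attribute, value) pair. Rule content is hypothetical.
RULES = [
    ({"engine_cranks": False, "lights_dim": True}, ("battery_dead", True)),
    ({"battery_dead": True}, ("recommendation", "recharge or replace the battery")),
    ({"engine_cranks": True, "fuel_gauge_empty": True}, ("recommendation", "add fuel")),
]


def forward_chain(facts, rules=RULES):
    """Fire every rule whose premise is satisfied until no new facts are produced."""
    facts = dict(facts)  # work on a copy of the known facts
    changed = True
    while changed:
        changed = False
        for premise, (attr, value) in rules:
            premise_true = all(facts.get(k) == v for k, v in premise.items())
            if premise_true and attr not in facts:
                facts[attr] = value  # the rule fires and adds a new fact
                changed = True
    return facts
```

Starting from `{"engine_cranks": False, "lights_dim": True}`, the first rule fires and adds `battery_dead`, which in turn lets the second rule fire and produce the recommendation. All facts are supplied up front, which is what makes the approach data driven.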

Backward chaining. An alternative approach begins with a rule that could conclude the goal for the consultation ("what action do you recommend to get my car to start?"), tries to obtain values for the attributes used in the rule's premise, then backtracks through additional rules if necessary to determine a value of the goal attribute. When there are many attributes employed in many rules, the backward chaining mechanism produces a more efficient interview than forward chaining because it will not be necessary to ask the user to input values of all of the facts.
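
A minimal sketch of this goal-driven behavior is shown below, again with a hypothetical fragment of car-starting rules and a simple rule format assumed for illustration. The `ask` callback stands in for the interview with the user: it is called only for attributes that no rule can conclude and that a rule under consideration actually needs.

```python
# Each rule: (premise, conclusion). Rule content is hypothetical.
RULES = [
    ({"engine_cranks": False, "lights_dim": True}, ("battery_dead", True)),
    ({"battery_dead": True}, ("recommendation", "recharge or replace the battery")),
    ({"engine_cranks": True, "fuel_gauge_empty": True}, ("recommendation", "add fuel")),
]


def backward_chain(goal, facts, ask, rules=RULES):
    """Determine the value of `goal`, recursing through rules that can conclude it."""
    if goal in facts:
        return facts[goal]
    candidates = [rule for rule in rules if rule[1][0] == goal]
    if not candidates:
        facts[goal] = ask(goal)  # no rule concludes it, so ask the user
        return facts[goal]
    for premise, (_, value) in candidates:
        # Each premise attribute becomes a subgoal; all() stops at the first failure,
        # so later attributes of a failing rule are never requested.
        if all(backward_chain(attr, facts, ask, rules) == wanted
               for attr, wanted in premise.items()):
            facts[goal] = value
            return value
    facts[goal] = None  # no rule could establish the goal
    return None
```

With answers `engine_cranks = False` and `lights_dim = True`, the goal `recommendation` is resolved after two questions and `fuel_gauge_empty` is never asked.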

Backward chaining's goal-oriented behavior is efficient because it avoids requests for input that won't contribute to determining the value of the consultation's goal. As a result, it provides the foundation for most rule-based expert systems. Backward chaining systems are described as hypothesis driven because they operate by selecting successive rules that can determine the value of a goal or subgoal: this value becomes the hypothesis to be proven or disproven.

In some interview scenarios it is natural to collect data in advance, perhaps using a paper questionnaire. In other cases input to an expert system is collected automatically, perhaps using sensors on a machine. For these two situations the forward chaining approach makes sense. Forward chaining systems are described as data driven because they deduce everything they can from a set of data rather than working backward from a hypothesis.

To provide the most flexibility, many expert system shells support both forward and backward chaining, even in the same interview. For example, some initial data might be requested and forward chained before the backward chaining operation of the inference engine is started. The inference engine's control capabilities enable this flexibility.

Knowledge acquisition and design structuring (KADS) - Framework for knowledge-based systems that use domain ontologies. The KADS model contains 3 basic components:
1) The basic concepts of the domain
2) The primitive inferences that define relationships between the concepts
3) A task structure that sequences the primitive inferences

Problem Solving Methods:
1) Heuristic classification
2) Constraint satisfaction
3) Planning
4) Fault diagnosis
5) Probabilistic reasoning
6) Abstraction of primary data into summaries
7) Case-based reasoning

EMR Structures:
1) The record describes events as a function of time
2) Each event contains data components
3) All data components require separate activities of people and are classified as "actions"

Events contain data components for all three stages of the diagnostic-therapeutic cycle: observations, decisions, and interventions.

Relationships need to be designated between actions and problems, or between actions and other actions.

The problem-oriented record as defined by Weed is an example of semantics added to the record, and the SOAP approach links the aspects of a problem together into subjective / objective / assessment / decisions and plans (interventions).

CPRs have different views:
1) Source oriented - data by categories, i.e. lab, x-rays, ECG, etc.
2) Problem oriented - data viewed by problem relationship

A source-oriented view is content independent, whereas a problem-oriented view is not. Both views are important.

Reliability - must be high
Validity checks - should exist during data entry

Temporal Aspects of the Medical Record

The record is a chronological account of observations, interpretations, and interventions at different granularity and precision, which is usually contextually based.

An ideal CPR would allow for 3 timestamps on data relevant to an event:
1) The moment the data was entered
2) The moment when insight was gained
3) The moment when the insight became applicable

Timestamps are critical for building a record for evidence-based medicine.

Requirements for Data Representation
- Data should reflect events during all three stages of the diagnostic-therapeutic loop (observation, diagnosis, intervention)
- Each event contains different actions that reflect the activities of the involved providers
- Actions may belong to more than 1 problem, and semantic relationships need to be documented
- All data should have a time stamp
- Data should be captured with multiple purposes in mind
- Completeness at data entry and reliability of data should be checked
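
One way to picture these requirements, together with the three timestamps of the ideal CPR, is as a small data structure. This is only a sketch: the class and field names, the `source` categories, and the completeness rule are assumptions made for illustration and do not describe any particular record system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class Stage(Enum):
    """Stages of the diagnostic-therapeutic loop."""
    OBSERVATION = "observation"
    DECISION = "decision"
    INTERVENTION = "intervention"


@dataclass
class Action:
    stage: Stage
    source: str  # category for the source-oriented view, e.g. "lab"
    provider: str  # who performed the activity
    content: str
    entered_at: datetime  # moment the data was entered
    insight_gained_at: datetime  # moment when insight was gained
    applicable_from: datetime  # moment when the insight became applicable
    problems: set = field(default_factory=set)  # an action may belong to several problems


@dataclass
class Event:
    occurred_at: datetime
    actions: list = field(default_factory=list)


def _actions_in_time_order(events):
    for event in sorted(events, key=lambda e: e.occurred_at):
        yield from event.actions


def problem_view(events, problem):
    """Problem-oriented view: actions linked to one problem, in time order."""
    return [a for a in _actions_in_time_order(events) if problem in a.problems]


def source_view(events, source):
    """Source-oriented view: actions from one data category, in time order."""
    return [a for a in _actions_in_time_order(events) if a.source == source]


def missing_fields(action):
    """Completeness check at data entry: names of required text fields left empty."""
    return [name for name in ("source", "provider", "content") if not getattr(action, name)]
```

Both views are computed from the same stored actions, so the record itself does not have to commit to one orientation.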

Data Entry -
1) NLP - physicians work the same way as usual; speech recognition and NLP are used to derive structured, coded data entries
2) Knowledge-driven structured data entry - forms; good for completeness and reliability, but take longer
Also note that patient-driven data entry is increasingly used.

Subjectivist Evaluation - Qualitative approach: the aim is to describe the world, and the information systems in it, as they appear to individual people
- Emphasis on careful, unbiased observation, identification of themes or questions, and attempts to confirm and refine these by further observation
- Use ETHNOGRAPHIC TECHNIQUES: analysis of documents, structured/unstructured interviews, participant observation, video analysis
- Painstakingly analyze the data collected
- Techniques are useful for defining user requirements and assessing the impact of an installed system on the experience of individual users

Objectivist Evaluation - assumes there are truths in the world that, given satisfactory measurement methods, can be recorded, and that all will agree on these truths
- Holds that there are "objects" that have real but unobservable "attributes", which "judges" can infer by observing one or more items of information

Evaluation Techniques

Framework for Objectivist Study:
- Define and prioritize study questions
- Define the "system" to study
- Select/develop reliable, valid measurement methods
- Design the demonstration study: descriptive, correlational, or comparative
- Eliminate bias
- Ensure results are generalizable
- Perform the study
- Analyze results / report results

Reliability: the measurement method produces similar results irrespective of who measures or when, assuming that the quantity being measured is static, i.e. within-observer and between-observer variability is low.
Validity: how much the measurement result reflects what it is intended to measure, for example diagnostic accuracy.
Making reliable, valid measurements is HARD!

Descriptive studies: describe attributes of an object or class of objects - NO independent variables

Correlational studies: compare two or more measured quantities and attempt to correlate them without attributing cause and effect --> technically, we try to correlate changes in the dependent variable with changes in the independent variables
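
As an illustration of what "correlating" two measured quantities can mean in practice, the sketch below computes the Pearson correlation coefficient, which is one common choice and not necessarily the one the notes have in mind. The function name is made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A value near +1 or -1 indicates a strong linear association between the independent and dependent variable, but by itself says nothing about cause and effect.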

Comparative studies: Compare the properties of two objects or one object in 2 states to attribute a cause and effect. The main independent variable is the state of the object

Understanding the different kinds of studies, and distinguishing dependent from independent variables, helps the evaluator formulate appropriate questions and select the proper design.

Eliminate Bias - Bias/confounding means that the results cannot be trusted because there is some unknown or known systematic effect that may account for the findings

Common Types of Bias
- Experimenter's bias is the name given to the phenomenon in which scientists unconsciously affect subjects in experiments.
- Attentional bias is the tendency for a particular class of stimuli to capture attention.[1] Attentional bias can also refer to the tendency of our perception to be affected by our recurring thoughts.
- Sampling bias is systematic error due to a non-random sample of a population,[2] causing some members of the population to be less likely to be included than others, resulting in a biased sample, defined as a statistical sample of a population (or non-human factors) in which all participants are not equally balanced or objectively represented.[3] It is mostly classified as a subtype of selection bias,[4] sometimes specifically termed sample selection bias,[5][6][7] but some classify it as a separate type of bias.[8]
- Placebo effect: patients given a placebo treatment will have a perceived or actual improvement in a medical condition.
- The Hawthorne effect (commonly referred to as the observer effect) is a form of reactivity whereby subjects improve or modify an aspect of their behavior, which is being experimentally measured, in response to the fact that they know that they are being studied.
- Repeated measures design uses the same subjects with every branch of research, including the control.[1] For instance, repeated measurements are collected in a longitudinal study in which change over time is assessed.


