Data Mining and Decision Research Page for Dr Darryl N. Davis

Knowledge Engineering and Data Mining


Knowledge Engineering and Discovery in Medical (and other) Domains

I have been working in knowledge engineering and data mining since my MSc (1986/7). The MSc thesis concerned the application of machine learning to a logic database (a urology database from an English hospital). Rather simplistically, I consider data mining and knowledge discovery to be part of a larger enterprise: knowledge engineering. Recent projects include:

  • Framework 7 Project: BRAVEHEALTH. Patient Centric Approach for an Integrated, Adaptive, Context Aware Remote Diagnosis and Management of Cardiovascular Diseases.
  • Philips Healthcare Project: Advanced Medical Intelligence. Predicting heart failure through application of data mining to large datasets.
  • Framework 7 Network of Excellence Project: Semantic Health. SemanticHealthNet will develop a scalable and sustainable pan-European organisational and governance process for the semantic interoperability of clinical and biomedical knowledge, to help ensure that EHR systems are optimised for patient care, public health and clinical research across healthcare systems and institutions.
  • HEIF-5 University of Hull TeleHealth: Advancing Computational Frameworks for TeleHealth. Two two-year PDRAs investigating issues related to ongoing DCS TeleHealth work, in particular: Dependable and Adaptive Frameworks for TeleHealth; and Computational Issues and Case Studies in TeleHealth.
  • SEED Funded PhD Project: Data Mining in Medicine Using Fuzzy Logic. Full DCS-SEED Scholarship to M. Mostafizur Rahman, supervised by D.N. Davis.
  • University Funded PhD Project: Handling Complexities in Large Clinical Datasets. Full University Scholarship to Lisa Moore, supervised by C. Kampbhampati, D.N. Davis and J. Cleland (HYMS).
  • KTP Project: E-Business Intelligence. Improving business intelligence using decision support and data mining technology.
  • KTP Project: Deer Initiative. Grant funded by the Natural Environment Research Council and Welsh Government. The project will develop an adaptive management programme, underpinned by a computer-based decision support tool, for the sustainable management of wild deer impacts on human interests. This knowledge and capability will be transferred to the DI for commercial and societal benefits and promulgation to stakeholders.

    Pattern recognition techniques used to date include:

    Tree Induction Over Logic Databases
    Rule Induction Over Logic Databases
    Statistical Classification for Image Feature Classification
    Evolvable Blackboard Architectures for Solution Optimisation
    MultiVariate Pattern Recognition for MicroFossil Classification
    Adaptive Contour Models for Neuroanatomical Feature Classification
    MultiLayer Perceptron and Support Vector Machines for CardioVascular Medicine
    Data Reduction using Pawlak Sets (Charlotte Bean)
    Tree Induction for CardioVascular Medicine using CART, DTREG, XpertRuleMiner, WEKA Algorithms, etc.
    Association Rule Generation using WEKA Algorithms
    Bayesian Entropy Feature Selection and Filtering


    Publications related to the development of diagnostic models in medicine

    An Algorithm for Fast Mining Top-rank-k Frequent Patterns based on N-list data structure,
                        Qian Wang, Jiadong Ren, Darryl N Davis, Yongqiang Cheng. Intelligent Automation & Soft Computing (Autosoft), September 2017
    Missing Value Imputation Using Stratified Supervised Learning for Cardiovascular Data,
                        M.M. Rahman and D.N. Davis, Journal of Informatics and Data Mining, 2016, 1:13.
    Mining frequent biological sequences based on bitmap without candidate sequence generation
                        Qian Wang, Darryl N Davis, Jiadong Ren. Computers in Biology and Medicine, 2016 Feb 1;69:152-7. doi: 10.1016/j.compbiomed.2015.12.016
              Author Posting. (c) 'Copyright Holder', 2016. This is the author's version of the work. It is posted here by permission of 'Copyright Holder' for personal use, not for redistribution
    Machine Learning Based Data Pre-processing for the Purpose of Medical Data Mining and Decision Support
                        M. Mostafizur Rahman, PhD Thesis, Department of Computer Science, University of Hull, July 2014.
    Semi Supervised Under-Sampling: A Solution to the Class Imbalance Problem for Classification and Feature Selection
                        M.M. Rahman and D.N. Davis, Chapter in: Transactions on Engineering Technologies (Special Volume of the World Congress on Engineering 2013) pp 611-625, Springer, 2014.
    Use of Cumulative Information Estimations for Risk Assessment of Heart Failure Patients
                        2014 IEEE International Conference on Fuzzy Systems, Beijing, 2014
    Analysis of fuzzy decision trees on expert fuzzified heart failure data,
                        IEEE International Conference on Systems, Man, and Cybernetics, Special Session: Soft Computing - C12-01, Year: 2013, Pages: 350-355, ISBN: 978-0-7695-5154-8.
    Prediction of mortality rates in heart failure patients with data mining methods,
                        Annales UMCS, Informatica, Volume: 13, Issue: 2, Year: 2013, Pages: 7-16, ISSN: 1732-1360, DOI: 10.2478/v10065-012-0046-7.
    Alternating decision tree applied to risk assessment of heart failure patients,
                        Journal of Information Technologies, Volume: 6, Issue: 2, Year: 2013, Pages: 25-33, ISSN: 1337-7467.
    Cluster Based Under-Sampling for Unbalanced Cardiovascular Data,
                        The 2013 International Conference of Data Mining and Knowledge Engineering (ICDMKE'13), World Congress on Engineering 2013 (WCE 2013) (July 3-5, 2013, London).
    Addressing the Class Imbalance Problem in Medical Datasets,
                        2nd International Conference on Knowledge Discovery ICKD 2013 Copenhagen, Denmark. May 19-20, 2013.
                        International Journal of Machine Learning and Computing, IJMLC 2013 Vol.3(2): 224-228 ISSN: 2010-3700 DOI: 10.7763/IJMLC.2013.V3.307
    Machine Learning Based Missing Value Imputation Method for Clinical Datasets
              Chapter 19 in: IAENG Transactions on Engineering Technologies
              - Special Issue of the World Congress on Engineering 2012, Yang, Gi-Chul; Ao, Sio-long; Gelman, Len (Eds.), 2013, VII, 814 Springer
    Fuzzy rule-based system applied to risk estimation of cardiovascular patients,
              The definitive version is to be published in Journal of Multiple-Valued Logic and Soft Computing, Volume 20, Number 5-6, pp.445-466, 2013.
    Home Telemonitoring Reduces Hospitalization for Heart Failure in an NHS Service: A Propensity-Matched Analysis,
              Third Annual International Congress on Telehealth and Telecare, Innovation, Integration, Implementation, The King's Fund, London, UK, 1–3 Jul 2013.
    Risk estimation of heart failure patients using Weka,
              Published in Proc. of the International Conference on Open Source Software in Education, Research and IT Solutions (OSSConf 2013), pp. 27-32, 2013.
              The Society for Open Information Technologies (SOIT) in Bratislava, Slovakia, Zilina, Slovakia. ISBN: 978-80-970457-3-9
    Diagnosis and Management of Cardiovascular Disease with an Intelligent Decision-Making Support System
              ULAB Journal of Science and Engineering, Vol. 3, 2012, pp. 2-6.
    Risk Estimation of Cardiovascular Patients using weka
              Open Source Software in Education, Research and IT Solutions (OSSConf 2012), Slovakia, 2012.
    Alert rules for remote monitoring of cardiovascular patients
              Journal of Information Technologies Vol. 5, No. 1, 2012, ISSN 1337-7469
    A Comparative Study of Missing Value Imputation with Multiclass Classification for Clinical Heart Failure Data
              The 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD' 12), 29-31 May 2012, Chongqing, China
    Fuzzy Unordered Rules Induction Algorithm Used as Missing Value Imputation Methods for K-Mean Clustering on Real Cardiovascular Data
              ICDMKE_5 : The 2012 International Conference of Data Mining and Knowledge Engineering, World Congress on Engineering 2012 (WCE 2012) (July 4-6, 2012, London)
    Estimation of cardiovascular patient risk with a Bayesian network
              TRANSCOM 2011, 9th European Conference of Young Research and Scientific Workers, Žilina, Slovak Republic, June 27–29, 2011
    Data mining applied to cardiovascular data
              Journal of Information Technologies Vol. 3, No. 2, November 2010, ISSN 1337-7469
    Generating and Verifying Risk Prediction Models Using Data Mining: A Case Study from Cardiovascular Medicine
              Published as a chapter in "Data Mining and Medical Knowledge Management: Cases and Applications",
              Editors: Petr Berka, Jan Rauch, & Djamel Abdelkader Zighed, IGI Global Inc., 2009.
    Predicting Cardiovascular Risks using Pattern Recognition and Data Mining
              Thuy Thi Thu Nguyen, Ph.D Thesis, Computer Science, University of Hull, August 2009.
    Generation and Verification of Risk Prediction Models for Carotid Endarterectomy using Data Mining and Neural Network Techniques
              European Society for Cardiovascular Surgery 57th Annual Congress of ESCVS, April 24-27, 2008 Barcelona Spain
    A Clustering Algorithm For Predicting CardioVascular Risk
              The 2007 International Conference of Data Mining and Knowledge Engineering, London, U.K., 2-4 July, 2007
    Feature Selection and Predicting CardioVascular Risk
              University of Hull Second Biosciences Workshop, December 2006.
    Predicting Cardiovascular Risks Using POSSUM, PPOSSUM and Neural Net Techniques
              ICEIS2006, 8th International Conference on Enterprise Information Systems, May 2006.
    Predicting CardioVascular Risk Using Neural Net Techniques
              University of Hull Biosciences Workshop, June 2005.
    The use of Artificial Neural Networks for risk prediction following Carotid endarterectomy
              Unpublished Paper, 2001.

    Multi-Agent Decision Support Systems (MADSS)

    Ongoing work concerns the development of MADSS (Multiple Agent Decision Support System), an agent-based decision support framework that can supply decision-enabling information in a number of domains. This builds on my PhD and subsequent work in applying blackboard systems to medical image problems. We suggest that behaviours useful in solving problems associated with specific information domains result from designing specific architectures for particular types of agent communities. The grain of the design and architecture varies with the domain and task. Specific applications include machine vision in medicine, water supply infrastructure decision making, stock-trading portfolio management, and cardiovascular diagnosis and prognosis.
    Combining KADS with Zeus to Develop a Multi-Agent E-Commerce Application,
        International Journal of Electronic Commerce Research, 3(3-4):315-335, Kluwer, 2003
    A Multi-Agent System Framework for Decision Support in Stock Trading
       The IEEE Network Magazine Special Issue on Enterprise Networking and Services, Vol.16, No. 1, Jan/Feb 2002
    Information and Knowledge Exchange in a Multi-Agent system for Stock Trading
        IEEE/IEC Enterprise Networking Applications and Services Conference (EntNet2001), Atlanta, USA, July 2001.
    Using KADS to Design a Multi-Agent Framework for Stock Trading
      Agents for E-Business on the Internet, The 2001 International Multi-Conference Event, Las Vegas June 2001.
    Agent-Based Decision Support Framework for Water Supply Infrastructure and Development
        International Journal of Computers, Environment and Urban Systems, 24, 173-190, 2000
    A Multi-Agent Framework for Stock Trading
        World Computing Conference 2000, Beijing, August 2000
    Using KADS to Build Agents for E-Commerce
        Journal of Applied Software Systems, 2000
    The Application of Expert System and Agent Technology to Water Mains Rehabilitation Decision Making
        New Review of Applied Expert Systems, 5, 5-18, 1999
    An Agent Framework for Decision Support in the Water Industry
        International Conference on Artificial Intelligence Applied to Soft Computing, Honolulu. 1999.

    Tools used in building these systems include:

    WEKA
    Cross-Industry Standard Process for Data Mining (CRISP-DM)
    Rosetta, CART, etc.
    Knowledge Analysis and Design System (KADS)
    XML as agent language
    KQML as agent language
    The Knowledge Acquisition Grid (KAG)


    File maintained by Dr D.N.Davis AT hull.ac.uk