HBS D3 Machine Learning for Leaders 26 Oct 2022
- AI versus ML (as definitions)
- AI is a useful, informal term (no formal, precise definition in industry or academia at this time)
- ML has a precise meaning
- technique that lets software learn behavior from sample data, rather than from pre-defined rules
- Example: rule-based programming vs ML for spam filtering
- how do you filter bad email?
- traditional approach:
- rules (if email contains X + if email contains Y + if email contains Z + …)
- ML:
- model learns patterns from labeled example emails
- classify by learning pattern
- model: a mathematical representation of patterns and insights
- model tries to classify new emails as spam or not spam
- model is refined to reduce errors
- the bigger (and more varied) the data set, the better the training
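- a minimal sketch of the contrast above (Python, assuming scikit-learn; the keyword list, example emails, and labels are made up for illustration):

```python
# Traditional approach: hand-written rules
SPAM_KEYWORDS = ["free money", "act now", "winner"]  # hypothetical rules

def rule_based_is_spam(email: str) -> bool:
    text = email.lower()
    return any(keyword in text for keyword in SPAM_KEYWORDS)

# ML approach: learn word patterns from labeled examples
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "Free money!!! Act now to claim your prize",
    "Meeting moved to 3pm, agenda attached",
    "You are a winner, click here",
    "Lunch tomorrow? Let me know",
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (toy labeled dataset)

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)  # the model learns patterns; no rules written

print(model.predict(["Claim your free prize now"]))  # likely [1] (spam)
```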
- How does a machine learn?
- teachable machine at google
- https://teachablemachine.withgoogle.com/v1/
- exercise
- create labeled image data from a source (e.g., webcam), built on a pre-existing model
- system will learn patterns and classify images
- person's face, orange, banana
- what led to errors in model?
- what examples would improve the model?
- ML development typically involves re-use
- datasets
- large data sets are expensive and time-consuming
- often re-used
- new datasets are created from scratch for specific use cases
- ML models
- cloud and on-device platforms offer AI products that may be used and re-used with no modification
- DIY projects often start with pre-trained models
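- a minimal sketch of that re-use pattern (Python, assuming TensorFlow/Keras and its bundled ImageNet-pretrained MobileNetV2; "photo.jpg" is a placeholder path):

```python
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

# Pre-trained model, used as-is with no modification
model = MobileNetV2(weights="imagenet")

# Classify one image (placeholder path)
img = image.load_img("photo.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # e.g. banana / orange / ...
```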
- From classification to generation
- teachable machine did classification
- given input predict 1 of 3 classes
- some models can predict thousands of classes
- example: given a snippet of text, predict the next word
- this enables generation of prose, software code, etc
- may seem like a dramatically different capability, but the underlying issues remain the same
- training data drives performance
- bias is an issue
- results are often noisy
- generating text: why does it matter?
- researchers realized that high-performance autocomplete can be used to tackle many different problems, controlled by a plain English interface
- I saw the Red Sox play at (+ geographic knowledge)
- 3 + 7 = (+ arithmetic)
- GPT-3 (OpenAI, 2020) helped make this apparent
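- a minimal sketch of this "autocomplete as an interface" idea (Python, assuming the Hugging Face transformers library and the small public GPT-2 model; GPT-3 itself is only available via OpenAI's API, and outputs will vary and are often noisy):

```python
from transformers import pipeline

# A small open language model that predicts the next words
generator = pipeline("text-generation", model="gpt2")

# Different "tasks", all posed as plain-English completions
for prompt in ["I saw the Red Sox play at", "3 + 7 ="]:
    result = generator(prompt, max_new_tokens=5, num_return_sequences=1)
    print(result[0]["generated_text"])
```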
- global conversation about AI principles
- organizations, companies, HigherEd, governments, etc.
- Google’s AI principles
- AI should
- be socially beneficial
- avoid creating or reinforcing unfair bias
- be built for safety
- be accountable to people
- incorporate privacy design principles
- uphold high standards of scientific excellence
- be made available for uses that accord w/ these principles
- Google will not pursue AI projects that:
- are likely to cause overall harm
- cause direct injury
- enable surveillance violating international norms and laws
- contravene international law and human rights
- Concrete action for accountability:
- good documentation
- data cards
- dataset provenance
- intended / suitable use cases
- data-set make-up, distributions
- model cards
- model provenance
- model usage
- ethics-informed evaluation
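- a hypothetical, minimal model card sketched as plain structured data (Python; the field names and values are illustrative, not a formal standard):

```python
# Hypothetical model card: all names and numbers are illustrative
model_card = {
    "model_provenance": {
        "name": "spam-classifier-v1",                # hypothetical model
        "base_model": "none (trained from scratch)",
        "training_dataset": "example-emails-2022",   # dataset provenance
    },
    "intended_use": {
        "suitable": ["filtering consumer email"],
        "unsuitable": ["legal or medical triage"],
    },
    "dataset_makeup": {
        "size": 10_000,
        "label_distribution": {"spam": 0.4, "not_spam": 0.6},
    },
    "evaluation": {
        "metrics": {"accuracy": 0.95},
        "ethics_review": "checked performance across languages/regions",
    },
}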
- Explainability: can you explain the output of a ML system?
- explanation for whom?
- what do we mean by explain?
- helps developers identify problems
- identify sources of error and bias
- empower users of ML
- Is ML a black box?
- not inherently, but it IS complex
- tools for developers to understand models
- new ways to explain
- can customize explanation to roles
- some limitations
- can’t always provide comprehensive explanations
- research continues
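- as one concrete example of a developer-facing explanation tool (not necessarily one shown in the session): permutation importance, sketched below in Python with scikit-learn and its built-in toy dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature and measure how much test accuracy drops:
# a big drop means the model leaned heavily on that feature.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=5, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"{X.columns[i]}: {result.importances_mean[i]:.3f}")
```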
Case Study on Diabetic Retinopathy (DR)
- fastest-growing cause of preventable blindness
- not always enough doctors to diagnose
- India, for example (shortage of 127k eye doctors)
- 45% of patients suffer vision loss before diagnosis
- diagnosing DR is difficult
- requires expertise
- disagreements amongst experts
- disagreements are resolved thru discussion
- software was developed to link data gathering with the corresponding expert discussions
- how do you train a model?
- take an existing model and re-train it
- Google started with an existing model that classified images of broccoli, fish, fire trucks, etc.
- 130K images of eye scans were provided and the model was re-trained
- models CAN be retrained for a new task
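- a minimal sketch of that retraining pattern (Python, assuming TensorFlow/Keras; the eye-scan directory, 5-class output, and hyperparameters are placeholders, not the actual Google/ARDA setup):

```python
import tensorflow as tf

# Start from a model pre-trained on everyday images (broccoli, fish, ...)
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # keep the general-purpose visual features

# Swap in a new output layer for the new task (e.g., 5 severity grades)
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # pixels -> [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Re-train on labeled scans (hypothetical directory, one folder per class)
train_ds = tf.keras.utils.image_dataset_from_directory(
    "eye_scans/", image_size=(224, 224))
model.fit(train_ds, epochs=3)
```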
- incorporate privacy design principles
- transparency
- consent
- access
- health regulators have been thinking about automation for a long time
- governance documentation by federal govt
- avoid creating or reinforcing unfair bias?
- what groups need to be well-represented in eye scan dataset?
- age, sex, pupil size, image quality (in DR use case)
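- a minimal sketch of auditing dataset make-up along those dimensions (Python, assuming pandas and a hypothetical metadata file with one row per scan):

```python
import pandas as pd

meta = pd.read_csv("eye_scan_metadata.csv")  # hypothetical metadata file

# Check how well each group is represented in the training data
for col in ["age_band", "sex", "pupil_size", "image_quality"]:
    print(meta[col].value_counts(normalize=True), "\n")

# An under-represented group is a flag to collect more examples there
```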
- ARDA (Automated Retinal Disease Assessment) was developed
- result to date
- screened 5k patients
- model performance is on par with eye specialists
- published 2 papers
- did better than generalists, not quite as good as specialists
- opened screening sites in Thailand
- Observations
- with nurses running screening, fewer specialists were needed in remote facilities
- wait times for referrals went from weeks to minutes
- images readable by humans were often deemed too “low quality” by the system
- poor internet connectivity led to frustrating wait times
- Dataset size
- if you’re using a robust, pre-trained ML model, and re-training it for a more specific use case, the dataset used for re-training could be smaller
- text prediction, in particular, seems to work well w/ these smaller dataset sizes
- more research being done on this
- articles
- Healthcare AI systems that put people at the center: https://dl.acm.org/doi/abs/10.1145/3313831.3376718
- Takeaways
- data is critical across systems
- wide range of systems, but for each one, training data is the driver for performance, bias, etc
- we can explain some things, some of the time
- very contextual, depends on specific application and systems
- Human-AI interaction
- evaluate beyond benchmarks; direct resources toward understanding people in the loop and societal consequences