Harvard IT Summit 11 June 2025

Keynote: Michael Smith, “You, Me, and ChatGPT”

  • Harvard relies on
    • people
    • physical space
    • digital tech
  • Unix pipeline that spell checks, described in Kernighan’s book and written by Steve Johnson
    • small programs, each doing one thing well, linked together
  • this paradigm doesn’t apply any more
    • think how big Grammarly is
    • OpenAI is an even bigger example: “everything for everyone”
  • unix: pipelines
  • GAI: prompts
  • tech trend: pipelines to prompts
  • Klara’s email was run through Grammarly, Johnson’s Unix pipeline, and ChatGPT in the sandbox
    • Grammarly didn’t surface a spelling issue
    • but the old Unix command produced 44 false positives (fake spelling errors)
    • ChatGPT in sandbox, prompt 1: no spelling errors, but no proof
    • prompt 2: re-create the Unix command with prompt instructions
      • this actually produced 40 fake errors
    • ran it again a few days later (how deterministic is the process?)
    • still came up with a list of fake errors (about 26); 3rd pass: 14
  • AI has an explainability problem
  • recent article:
    • Prompts matter
    • jagged frontier of GAI knowledge (it knows some things really well, others not so much, but sounds authoritative for both answers)
  • Chatbots and education
    • challenges
      • bad prompt -> student frustration
      • unknown frontier -> believable hallucinations
      • eager to answer, drives student cognitive effort to zero
      • good looking response: undeserved trust
    • opportunities
      • lots of knowledge in models
      • always available
      • fast to respond
      • important skill / tool for the future
  • Anecdote
    • don’t use chatgpt to outline argument
    • probably OK to sort references
    • but Michael had a student do exactly this and ChatGPT hallucinated references!
  • His case studies
    • comp sci 32 (intro to programming)
      • constrained chatbot as another way to get help
      • possible for students to get too much programming help
    • comp 221 (critical thinking in data science; has writing assignments)
      • use GAI to change what we ask of the students
      • possible for students to get too much writing/thinking help
      • less of a programming class, more on ethics
      • open with foundational aspects of ethics and intersection with technology
        • deep dive on historical impact of tech on privacy
        • not HOW to do data science
      • old version of the class got boring after spring break, when it was mostly reading and discussion
      • new version: still some reading, but group-project work enabled by GAI
      • 2-week sprint produces an alpha-like demo (and you critique your own work)
        • 2 deep fake videos on the ethics of voice cloning and deep fakes
  • dropping cost of cognition
    • how are we going to use this new more abundant resource?
      • experienced programmers in coding tasks (yes)
      • students in creating a working prototype (yes)
      • students in (existing) writing tasks (no)
      • teachers in leading a class (yes)
  • Ending observations
    • testing of GAI tools is our problem (unlike with other software products)
    • new challenges in protecting the integrity of the university’s work projects; raise awareness of the dangers of being overly trusting (similar to trust-but-verify principles in InfoSec)
    • wonderful AC 221 student paper on chatbots as companions, which will be a new threat vector as we become too friendly with and trusting of them, telling them things we shouldn’t
  • Q&A
    • IT can help faculty leverage GAI through rigorous testing guidance
    • and help out early adopters
      • not just provide the tool but understand what problems they’re trying to solve
      • partnership
      • help it propagate to others
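
The Unix spell-check pipeline from the keynote can be sketched stage by stage in Python. This is only an illustration of the "small programs linked together" idea, not Johnson's actual command: the stage names, the tiny dictionary, and the sample sentence are all assumptions made for the example. It also shows where false positives come from: any word missing from the dictionary (a name, for instance) is reported as an error.

```python
import re


def split_words(text):
    """Stage 1: break the text into words, dropping punctuation and digits."""
    return re.findall(r"[A-Za-z]+", text)


def lowercase(words):
    """Stage 2: fold everything to lower case."""
    return [w.lower() for w in words]


def sort_unique(words):
    """Stage 3: sort and remove duplicates (like `sort | uniq`)."""
    return sorted(set(words))


def not_in_dictionary(words, dictionary):
    """Stage 4: keep only the words the dictionary does not contain."""
    return [w for w in words if w not in dictionary]


def spell_check(text, dictionary):
    """Chain the stages, each one feeding the next, like a shell pipeline."""
    return not_in_dictionary(sort_unique(lowercase(split_words(text))), dictionary)


if __name__ == "__main__":
    # Hypothetical dictionary and input, chosen only for illustration.
    dictionary = {"the", "summit", "was", "great", "says"}
    print(spell_check("Klara says the summit was grate. The Summit!", dictionary))
    # prints ['grate', 'klara']
```

Here "grate" is a genuine catch, while "klara" is a false positive of the kind the keynote counted: the name is spelled correctly but is not in the word list.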

Event Driven Architecture in LXP

  • bulk transaction and async operations in LXP
    • sometimes you need to act on many items of data at once
    • API limitations:
      • timeouts, spiky consumption, partial failure
    • awkward solution: make the API an async drop box with a submit call and a check-status poll
    • better solution: use events to submit the request and to report the progress
    • don’t have to rebuild any spike smoothing or retry logic for each call – it’s already part of event delivery
  • 3rd party applications in LXP
    • school-specific CRPs
      • can emit events that LXP consumes, including bulk operations, which don’t work well in APIs
    • custom learner dashboards
      • updates about enrollment, course progress, etc.
    • custom analytics dashboards
    • doesn’t answer all integration needs, still need to bootstrap earlier state; can mix and match with traditional APIs
  • Pattern 1: notification events
    • emitters say what happened, not what consumers should do with it (contrast with commands)
    • encourages decoupling and multiple consumers
    • LXP uses this for analytics reporting
    • downside: harder to reason about the flow thru layers
  • Pattern 2: state change events
    • LXP is decentralized where each component maintains its own internal state (versus a centralized DB)
    • event emitter includes all info it knows about the state being changed and each consumer updates its own notion of the state based on whichever subset of that data is relevant to its domain needs
    • eventual consistency (computer time vs human time)
    • LXP is currently migrating from a single-consumer sync system to a general event system for this purpose
  • Pattern 3: event sourcing
    • a replayable ledger of state change events, like git version control
    • auditing, recovery, bootstrapping new consumers
    • LXP does not yet have plans to utilize event sourcing
      • although it does have some aspects of it as a side effect of other implementation details
  • Event driven systems: AWS architectures
    • Kinesis (data streaming)
    • EventBridge (routing and filtering) can target AWS services or HTTP endpoints
    • SQS / SNS (fanout, queuing and rate-limiting, notification of problems)
    • Lambda and Step Functions (onward logic)
    • Talk to your AWS technical consultant
  • LXP state-change events
    • reference the CNCF CloudEvents specification (on GitHub)
    • don’t reinvent the wheel; data payloads can be any object
  • LXP state change events
    • progress so far
      • clotho creates events
      • events are distributed to CDA
      • CDA makes info available
  • next steps
    • add distribution layers to both CDA and to consuming tenants (like HBSO and VPAL)
    • splice into prod, turn off HTTP REST API class
    • CDA to emit state change events (clotho listens and acts)
    • swift and async updates to CDA (and clotho with the same event)
    • bulk operations (many events and one event wrapper with multiple event payloads)
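
The event patterns in these notes can be sketched in a few lines of Python. This is a minimal in-memory illustration, not LXP's implementation: the envelope borrows the required CloudEvents attribute names (`specversion`, `id`, `source`, `type`), while the event type strings, the source path, the payload fields, the `EventBus` and `EnrollmentView` classes, and the shape of the bulk wrapper are all hypothetical. Delivery here is synchronous; a real system would hand events to something like EventBridge or SQS.

```python
import uuid
from datetime import datetime, timezone


def make_event(event_type, data, source="/example/emitter"):
    """Wrap a payload in a CloudEvents-style envelope (names are illustrative)."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),
        "data": data,  # the payload can be any object
    }


class EventBus:
    """Fans each event out to every subscriber and keeps a replayable ledger."""

    def __init__(self):
        self.ledger = []
        self.consumers = []

    def subscribe(self, consumer, replay=False):
        # Event-sourcing flavour: a new consumer can bootstrap from the ledger.
        if replay:
            for event in self.ledger:
                consumer(event)
        self.consumers.append(consumer)

    def publish(self, event):
        # The emitter says what happened; each consumer decides what to do.
        self.ledger.append(event)
        for consumer in self.consumers:
            consumer(event)


class EnrollmentView:
    """A consumer that maintains its own state from the fields it cares about."""

    def __init__(self):
        self.courses_by_learner = {}

    def __call__(self, event):
        if event["type"] == "example.enrollment.changed":
            self._apply(event["data"])
        elif event["type"] == "example.enrollment.bulk":
            # One wrapper event carrying multiple payloads.
            for payload in event["data"]["items"]:
                self._apply(payload)

    def _apply(self, data):
        courses = self.courses_by_learner.setdefault(data["learner"], set())
        if data["status"] == "enrolled":
            courses.add(data["course"])
        else:
            courses.discard(data["course"])


bus = EventBus()
view = EnrollmentView()
bus.subscribe(view)

bus.publish(make_event("example.enrollment.changed",
                       {"learner": "alice", "course": "course-101", "status": "enrolled"}))
bus.publish(make_event("example.enrollment.bulk", {"items": [
    {"learner": "bob", "course": "course-101", "status": "enrolled"},
    {"learner": "alice", "course": "course-101", "status": "dropped"},
]}))

# A consumer added later reaches the same state by replaying the ledger.
late_view = EnrollmentView()
bus.subscribe(late_view, replay=True)
```

The emitter never tells `EnrollmentView` what to do, so further consumers (an analytics dashboard, say) could subscribe without any change to the publishing side.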