In November 2018, I spoke at DataEngConf NYC (now called Data Council) about my work building the initial data infrastructure at Bowery Farming.
My main points were:
- The challenge in standing up a data ecosystem from scratch is "completing the loop". You've got to
  - collect the data
  - organize it all nicely
  - get it back out into the business to do something useful
- I accomplished this quickly and cheaply with
  - AWS SNS, SQS, and Kinesis Firehose to collect messages from our application and IoT fleet
  - Redshift to store and compute
  - dbt to transform
  - differentiated `ingest` and `warehouse` schemas to do extraction and transformation work in series
- Our warehouse functions as the foundation of both our human-facing analytics and our machine-facing ML data products
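
To make the `ingest` / `warehouse` split concrete, here is a minimal sketch of doing extraction and transformation in series. It is not the code from the talk: SQLite stands in for Redshift, a plain Python function stands in for a dbt model, and the table names (`raw_events`, `sensor_readings`) and message fields (`device_id`, `metric`, `value`) are hypothetical. The idea it illustrates is that raw messages land untouched in one schema, and typed, cleaned tables are rebuilt from them in another.

```python
import json
import sqlite3


def build_connection():
    """Create an in-memory database with separate ingest and warehouse schemas."""
    conn = sqlite3.connect(":memory:")
    # Attached databases act as schemas here (an assumption standing in for Redshift schemas).
    conn.execute("ATTACH DATABASE ':memory:' AS ingest")
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("CREATE TABLE ingest.raw_events (received_at TEXT, payload TEXT)")
    conn.execute(
        "CREATE TABLE warehouse.sensor_readings "
        "(device_id TEXT, metric TEXT, value REAL, received_at TEXT)"
    )
    return conn


def extract(conn, messages):
    """Extraction step: land each message body unchanged, as JSON text."""
    conn.executemany(
        "INSERT INTO ingest.raw_events VALUES (?, ?)",
        [(m["received_at"], json.dumps(m["body"])) for m in messages],
    )


def transform(conn):
    """Transformation step: rebuild the typed warehouse table from the raw rows."""
    raw = conn.execute(
        "SELECT received_at, payload FROM ingest.raw_events ORDER BY received_at"
    ).fetchall()
    cleaned = []
    for received_at, payload in raw:
        body = json.loads(payload)
        # Cast the value to a float so downstream consumers see one consistent type.
        cleaned.append(
            (body["device_id"], body["metric"], float(body["value"]), received_at)
        )
    # Full rebuild, so running the step twice gives the same result.
    conn.execute("DELETE FROM warehouse.sensor_readings")
    conn.executemany(
        "INSERT INTO warehouse.sensor_readings VALUES (?, ?, ?, ?)", cleaned
    )
    return len(cleaned)
```

Because the transformation only ever reads from `ingest`, it can be rerun or rewritten at any time without touching how the data was collected.
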
<iframe width="560" height="315" src="https://www.youtube.com/embed/chfIo1O0Cpk?si=wQzLcsmQqQ9CEW8w" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
<iframe src="https://docs.google.com/presentation/d/e/2PACX-1vQwBlws_i0NOB3E_JrZlA99v16LNkqbtw_2_kzTZAHz9tKQtW-oEPaKyKAaTVKRz3LHaq-Zs-SiZikG/embed?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>