But wait... there's 'more'!
Below you'll find an (increasing) assortment of useful notes/links on data/bases (and un/related topics)... The point of this content is to help connect what you learn in the lectures with the real world. There is likely to be a Final exam question based on what's below (so, pay attention to it!).
Look for the part about 'modeling': https://aeon.co/essays/a-mathematician-a-philosopher-and-a-gambler-walk-into-a-bar
Bye bye, ~5% of you (and, hello 'AI')... https://blogs.microsoft.com/blog/2023/01/18/subject-focusing-on-our-short-and-long-term-opportunity/
DATA is amazing, but ONLY goes so far (when it comes to emulating human-like intelligence): https://garymarcus.substack.com/p/24-seriously-embarrassing-hours-for
The underbelly/(in)human side of data-driven 'magic': https://www.digitaltrends.com/computing/investigation-exposes-murkier-side-of-ai-chatbot-industry
Here is Ed Codd's paper on relational DBs, the one that started it ALL!
OMG: https://futurism.com/the-byte/openai-billions-bad-ai
https://www.businessinsider.com/rise-of-oracle-founder-larry-ellison-2017-1#ellison-even-managed-to-turn-a-potential-loss-into-a-big-win-in-1999-ellisons-protg-marc-benioff-left-oracle-to-work-on-a-new-startup-called-salesforcecom-ellison-was-an-early-investor-putting-2-million-into-his-friends-new-venture-14
Math, word problems, common sense, AI: https://arxiv.org/abs/2301.09723
'Free' data transmission forever! https://theconversation.com/device-transmits-radio-waves-with-almost-no-power-without-violating-the-laws-of-physics-196271
ChatGPT-related:
https://www.cnbc.com/2023/01/31/google-testing-chatgpt-like-chatbot-apprentice-bard-with-employees.html
https://www.forbes.com/sites/richardnieva/2023/01/31/sergey-brin-code-request-lamda/
https://www.businessinsider.com/history-of-openai-company-chatgpt-elon-musk-founded-2022-12
Behind the facade: https://arstechnica.com/information-technology/2023/02/ai-powered-bing-chat-spills-its-secrets-via-prompt-injection-attack/
https://www.acm.org/diversity-inclusion/words-matter
'Generative Anything':
* https://www.intel.com/content/www/us/en/newsroom/opinion/unlocking-potential-generative-ai.html?
* https://hackernoon.com/top-10-ai-tools-to-check-out-if-youre-bored-with-chatgpt
* https://txt.cohere.ai/generative-ai-future-or-present/
'DAG'ster:
* https://www.google.com/search?q=dagit+dagster
* https://mbysolution.com/moving-past-airflow-why-dagster-is-the-next-generation-data-orchestrator/
* https://github.com/dagster-io/dagster
* https://docs.dagster.io/getting-started, https://docs.dagster.io/getting-started/hello-dagster
https://blog.bit.io/ai-powered-text-to-sql-translation-in-bit-io-1fbcf32fd586
https://medium.com/@huyphams/dataflow-a-new-toy-for-creating-internal-tools-defc62640ec1
https://icepanel.medium.com/top-8-diagramming-tools-for-software-architecture-2fc61d095b93
https://www.pcgamer.com/nvidia-predicts-ai-models-one-million-times-more-powerful-than-chatgpt-within-10-years/
A nice article on online gradient descent in SQL! https://maxhalford.github.io/blog/ogd-in-sql/ Here is my Colab notebook: https://drive.google.com/file/d/1LiC1l64Tfw7mJ-u1VYogBbB1Tj262zBd/view?usp=sharing
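To make the idea concrete outside of SQL, here is a minimal sketch of online gradient descent in plain Python. It is not the article's SQL, and not the contents of the Colab notebook; it assumes a least-squares linear model, and the function name `ogd_linear_regression`, the learning rate and the toy data are all made up for illustration. The key point is that the weights get updated after every single row, rather than after a pass over the whole dataset.

```python
def ogd_linear_regression(stream, n_features, lr=0.01):
    """Online gradient descent for least-squares linear regression.

    Processes one (x, y) pair at a time and updates the weights right away.
    """
    weights = [0.0] * n_features
    bias = 0.0
    for x, y in stream:
        # Predict with the current weights.
        y_pred = bias + sum(w * xi for w, xi in zip(weights, x))
        # Derivative of 0.5 * (y_pred - y)^2 with respect to y_pred.
        error = y_pred - y
        # One gradient step per incoming row.
        weights = [w - lr * error * xi for w, xi in zip(weights, x)]
        bias -= lr * error
    return weights, bias


# Hypothetical toy stream: noise-free points on y = 2*x + 1, replayed 500 times.
data = [((x / 10,), 2 * (x / 10) + 1) for x in range(10)] * 500
weights, bias = ogd_linear_regression(data, n_features=1, lr=0.1)
print(weights, bias)  # should end up close to [2.0] and 1.0
```

In a database setting, `stream` would be rows arriving in order, which is why the approach maps onto a row-by-row query.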
NN-based hash functions! https://techxplore.com/news/2023-03-method-boost-online-databases.html
Dynaboard - browser-based IDE for building web apps: https://dynaboard.com/features
'Gathr'
From OpenAI's own Lilian Weng: https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/
Chat - a new UI widget: https://openai.com/blog/chatgpt-plugins
https://www.quantamagazine.org/bob-metcalfe-ethernet-pioneer-wins-turing-award-20230322/
https://www.fermyon.com/blog/introducing-spin-v1
https://www.cncf.io/phippy/
https://medium.com/kontur-inc/kontur-ui-kit-68534ee45ba2 [more: https://www.kontur.io/]
https://blogs.oracle.com/connect/post/mysql-heatwave-lakehouse-brings-transactional-and-semistructured-data-into-one-high-speed-query-engine - a lakehouse product from Oracle [which maintains MySQL]
https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-c_v5.11.0.pdf
https://www.linkedin.com/advice/3/how-do-you-use-hashing-detect-plagiarism-academic-papers
Total.js Flow - a dataflow system - use Docker to run it :) It comes with LOTS of data-related components/nodes/boxes/operators.
'Generative anything' - a data analysis-oriented example: https://www.einblick.ai/prompt
A glitchy, spammy... Internet...
Airflow - a node-based (dataflow) pipeline: https://theaisummer.com/apache-airflow-tutorial
'Generators' are always fascinating - they are the SOURCE of data, geometry, pixels, audio... Here is a triangle waveform generator: https://www.edn.com/single-ic-forms-precision-triangular-wave-generator/ [you can hear what it might sound like, by doing Const -> Tri -> AudioOut, at https://noisecraft.app/ :)].
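A software version of such a generator is only a few lines. The sketch below is an assumption-laden toy (the function name `triangle_wave` and its parameters are hypothetical, and it has nothing to do with the IC circuit in the EDN article): it produces triangle-wave samples in the range -1 to 1, starting at the peak.

```python
def triangle_wave(freq_hz, sample_rate, n_samples):
    """Return n_samples of a triangle wave in [-1, 1], starting at +1."""
    samples = []
    for n in range(n_samples):
        # Position within the current cycle, from 0.0 up to (not including) 1.0.
        phase = (n * freq_hz / sample_rate) % 1.0
        # Falls linearly from +1 to -1 over the first half cycle, then rises back.
        samples.append(4 * abs(phase - 0.5) - 1)
    return samples


# One cycle of a 1 Hz wave sampled 4 times per second.
one_cycle = triangle_wave(freq_hz=1, sample_rate=4, n_samples=4)
print(one_cycle)  # [1.0, 0.0, -1.0, 0.0]
```

Feeding such samples to an audio output at, say, 44100 samples per second is roughly what the Tri node does in the NoiseCraft patch.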
https://sustainabilitydata.usc.edu/arcgis/apps/sites/#/usc-sustainability-data-hub and https://storymaps.arcgis.com/stories/0f01047c88554a219980030773569959 - two USC undergrad projects :)
https://medium.com/neo4j/context-aware-knowledge-graph-chatbot-with-gpt-4-and-neo4j-d3a99e8ae21e
https://www.sfchronicle.com/sf/article/survey-crime-san-francisco-17894081.php
https://bdtechtalks.com/2023/04/17/open-source-chatgpt-alternatives/
LOL: https://futurism.com/the-byte/smart-glasses-gpt-4
https://www.uscc.gov/research/shein-temu-and-chinese-e-commerce-data-risks-sourcing-violations-and-trade-loopholes
https://thenewstack.io/distributed-database-architecture-what-is-it/
https://www.theverge.com/2023/4/20/23691468/google-ai-deepmind-brain-merger
https://www.latent.space/p/agents
https://towardsdatascience.com/top-10-pre-trained-models-for-image-embedding-every-data-scientist-should-know-88da0ef541cd
https://restofworld.org/2023/ai-image-china-video-game-layoffs/
Here is a report from Goldman Sachs.
https://webutility.io/csv-to-insert-sql-online [omg!]
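The same CSV-to-INSERT conversion can be sketched with Python's standard library. This is not how the linked tool works internally (that is unknown to me); the table name `people` and the sample rows are invented, and the header row is assumed to be trusted, since column names are pasted into the SQL text while the values go through `?` placeholders.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would come from a file.
csv_text = "id,name\n1,Ada\n2,Edgar\n"
rows = list(csv.reader(io.StringIO(csv_text)))
header, records = rows[0], rows[1:]

conn = sqlite3.connect(":memory:")
# Column names come from the (trusted) header row; values use placeholders.
columns = ", ".join(f'"{col}"' for col in header)
placeholders = ", ".join("?" for _ in header)
conn.execute(f"CREATE TABLE people ({columns})")
conn.executemany(
    f"INSERT INTO people ({columns}) VALUES ({placeholders})", records
)

loaded = conn.execute("SELECT * FROM people ORDER BY id").fetchall()
print(loaded)  # [('1', 'Ada'), ('2', 'Edgar')]
```

Note that every value stays a string here, because the columns are created without types.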
https://cacm.acm.org/magazines/2023/4/271229-artificial-intelligence-for-materials-discovery/fulltext
'OI': https://spectrum.ieee.org/organoid-intelligence-computing-on-brain ; my take on structure-based intelligence: https://www.researchgate.net/publication/358886020_A_Physical_Structural_Perspective_of_Intelligence
https://thegradient.pub/software2-a-new-generation-of-ais-that-become-increasingly-general-by-producing-their-own-training-data/
https://github.com/booydar/t5-experiments/tree/scaling-report
https://arxiv.org/abs/2304.09349 [LLM-Brain]
How to probe GPT to reveal its most fundamental flaw - that it understands NOTHING (!): https://medium.com/@shlomi.sher/on-artifice-and-intelligence-f19224281bee
'Generative anything' - https://javascript.plainenglish.io/bill-gates-people-dont-realize-what-s-coming-dc06d3b81c9d
https://towardsdatascience.com/meta-ai-introduces-revolutionary-image-segmentation-model-trained-on-1-billion-masks-8f13c86a13a2
Decompose -> train -> recompose: https://bootcamp.uxdesign.cc/introducing-composer-the-latest-breakthrough-in-ai-image-generation-9a2350e2b9a0
LangChain: https://mohitmayank.medium.com/creating-gpt-driven-applications-using-langchain-a6ee08c383d4
Jupyter->webapp: https://towardsdatascience.com/build-elegant-web-apps-right-from-jupyter-notebook-with-mercury-78d9ebcbbcaf
https://javascript.plainenglish.io/my-boss-front-end-development-will-be-replaced-100-by-ai-354d79c79b5b
https://medium.com/codingthesmartway-com-blog/discover-thinkgpt-the-cutting-edge-python-library-that-transforms-ai-into-a-powerful-thinking-c7e588bd28b4
'Netwerk' diagram: https://observablehq.com/blog/r-to-javascript-shapiro
There goes the 'Prompt Engineering' job :) https://arxiv.org/abs/2211.01910
'Vector native' DBs: https://innerjoin.bit.io/why-you-should-care-about-vector-databases-1760186b5bf1 [look for this link in the article, where you can play with vector search: https://innerjoin.bit.io/vector-similarity-search-in-postgres-with-bit-io-and-pgvector-c58ac34f408b].
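What a vector search does, at its core, is rank stored vectors by similarity to a query vector. Here is a brute-force sketch in plain Python using cosine similarity. It is a stand-in for illustration only, not pgvector or bit.io code; the tiny 3-number "embeddings" and the names `cosine_similarity` and `nearest` are made up, and real systems use much longer vectors plus an index to avoid scanning everything.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, items, k=2):
    """Return the names of the k stored vectors most similar to query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical embeddings, hand-picked so that 'cat' and 'kitten' sit close together.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
result = nearest([1.0, 0.0, 0.0], embeddings, k=2)
print(result)  # ['cat', 'kitten']
```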
https://ai.facebook.com/blog/ai-dataset-animation-drawings
https://the-decoder.com/nvidia-shows-text-to-video-for-stable-diffusion/
https://datatables.net/examples/ [row-col data --> HTML table]
https://discord.com/blog/how-discord-stores-trillions-of-messages (thanks, Akash!)