<!-- ********************************** -->

#### &nbsp;

## **But wait... there's 'more'!**

### Below you'll find an (increasing) assortment of useful notes/links on data/bases (and un/related topics)... The point of this content is to help connect what you learn in the lectures with the real world. There is likely to be a Final exam question based on what's below (so, pay attention to them!).

---

### Look for the part about 'modeling': https://aeon.co/essays/a-mathematician-a-philosopher-and-a-gambler-walk-into-a-bar

---

###Bye bye, ~5% of you (and, hello 'AI')... https://blogs.microsoft.com/blog/2023/01/18/subject-focusing-on-our-short-and-long-term-opportunity/

---

###DATA is amazing, but ONLY goes so far (when it comes to emulating human-like intelligence): https://garymarcus.substack.com/p/24-seriously-embarrassing-hours-for

---

###The underbelly/(in)human side of data-driven 'magic': https://www.digitaltrends.com/computing/investigation-exposes-murkier-side-of-ai-chatbot-industry

---

### <a href="docs/Codd_1970.pdf">Here</a> is Ed Codd's paper on relational DBs, that started it ALL!

---

###OMG: https://futurism.com/the-byte/openai-billions-bad-ai

---

###https://www.businessinsider.com/rise-of-oracle-founder-larry-ellison-2017-1#ellison-even-managed-to-turn-a-potential-loss-into-a-big-win-in-1999-ellisons-protg-marc-benioff-left-oracle-to-work-on-a-new-startup-called-salesforcecom-ellison-was-an-early-investor-putting-2-million-into-his-friends-new-venture-14

---

### Math, word problems, common sense, AI: https://arxiv.org/abs/2301.09723

---

### 'Free' data transmission forever!
https://theconversation.com/device-transmits-radio-waves-with-almost-no-power-without-violating-the-laws-of-physics-196271

---

###ChatGPT-related:
#### https://www.cnbc.com/2023/01/31/google-testing-chatgpt-like-chatbot-apprentice-bard-with-employees.html
#### https://www.forbes.com/sites/richardnieva/2023/01/31/sergey-brin-code-request-lamda/
#### https://www.businessinsider.com/history-of-openai-company-chatgpt-elon-musk-founded-2022-12

---

###Behind the facade: https://arstechnica.com/information-technology/2023/02/ai-powered-bing-chat-spills-its-secrets-via-prompt-injection-attack/

---

###https://www.acm.org/diversity-inclusion/words-matter

---

###'Generative Anything':
####* https://www.intel.com/content/www/us/en/newsroom/opinion/unlocking-potential-generative-ai.html?
####* https://hackernoon.com/top-10-ai-tools-to-check-out-if-youre-bored-with-chatgpt
####* https://txt.cohere.ai/generative-ai-future-or-present/

---

###'DAG'ster:
####* https://www.google.com/search?q=dagit+dagster
####* https://mbysolution.com/moving-past-airflow-why-dagster-is-the-next-generation-data-orchestrator/
####* https://github.com/dagster-io/dagster
####* https://docs.dagster.io/getting-started, https://docs.dagster.io/getting-started/hello-dagster

---

###https://blog.bit.io/ai-powered-text-to-sql-translation-in-bit-io-1fbcf32fd586

---

###https://medium.com/@huyphams/dataflow-a-new-toy-for-creating-internal-tools-defc62640ec1

---

###https://icepanel.medium.com/top-8-diagramming-tools-for-software-architecture-2fc61d095b93

---

###https://www.pcgamer.com/nvidia-predicts-ai-models-one-million-times-more-powerful-than-chatgpt-within-10-years/

---

###A nice article on online gradient descent in SQL! https://maxhalford.github.io/blog/ogd-in-sql/ Here is my Colab notebook: https://drive.google.com/file/d/1LiC1l64Tfw7mJ-u1VYogBbB1Tj262zBd/view?usp=sharing

---

###NN-based hash functions!
https://techxplore.com/news/2023-03-method-boost-online-databases.html

---

###Dynaboard - browser-based IDE for building web apps: https://dynaboard.com/features

---

###<a href="clips/Gathr.mp4">'Gathr'</a>

---

###From OpenAI's own Lilian Weng: https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/

---

###Chat - a new UI widget: https://openai.com/blog/chatgpt-plugins

---

###https://www.quantamagazine.org/bob-metcalfe-ethernet-pioneer-wins-turing-award-20230322/

---

###https://www.fermyon.com/blog/introducing-spin-v1

---

###https://www.cncf.io/phippy/

---

###https://medium.com/kontur-inc/kontur-ui-kit-68534ee45ba2 [more: https://www.kontur.io/]

---

###https://blogs.oracle.com/connect/post/mysql-heatwave-lakehouse-brings-transactional-and-semistructured-data-into-one-high-speed-query-engine - a lakehouse product from Oracle [who maintains MySQL]

---

###https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-c_v5.11.0.pdf

---

###https://www.linkedin.com/advice/3/how-do-you-use-hashing-detect-plagiarism-academic-papers

---

###<a href="https://www.totaljs.com/flow/">Total.js Flow</a> - a dataflow system - use <a href="pics/totaljs_flow.png">Docker</a> to <a href="clips/totaljs_flow.mp4">run</a> it :) It has (comes with) LOTS of data-related components/nodes/boxes/operators.

---

### 'Generative anything' - a data analysis-oriented example: https://www.einblick.ai/prompt

---

### A glitchy, spammy... <a href="docs/GlitchySpammy.pdf">Internet</a>...

### Airflow - a node-based (dataflow) pipeline: https://theaisummer.com/apache-airflow-tutorial

---

###'Generators' are always fascinating - they are the SOURCE of data, geometry, pixels, audio... Here is a triangle waveform generator: https://www.edn.com/single-ic-forms-precision-triangular-wave-generator/ [you can hear what it might sound like, by doing Const -> Tri -> AudioOut, at https://noisecraft.app/ :)].
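
If you'd rather see a triangle 'generator' as data than as a circuit, here is a minimal Python sketch. The function name `triangle_wave` and its parameters are hypothetical (made up for this note), and this is neither the EDN circuit nor how NoiseCraft implements its Tri node; it just produces the same kind of waveform as a list of samples, which you could then plot or write out as audio.

```python
def triangle_wave(freq_hz, sample_rate, num_samples, amplitude=1.0):
    """Return num_samples of a triangle wave in [-amplitude, +amplitude]."""
    samples = []
    for n in range(num_samples):
        # Position within the current cycle, from 0.0 up to (not including) 1.0.
        phase = (n * freq_hz / sample_rate) % 1.0
        # Rises linearly from -1 at phase 0 to +1 at phase 0.5, then falls back.
        samples.append(amplitude * (1.0 - 4.0 * abs(phase - 0.5)))
    return samples


# One cycle at 8 samples per cycle (hypothetical values, chosen for readability).
print(triangle_wave(freq_hz=1, sample_rate=8, num_samples=8))
```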
---

###https://sustainabilitydata.usc.edu/arcgis/apps/sites/#/usc-sustainability-data-hub and https://storymaps.arcgis.com/stories/0f01047c88554a219980030773569959 - two USC undergrad projects :)

---

###https://medium.com/neo4j/context-aware-knowledge-graph-chatbot-with-gpt-4-and-neo4j-d3a99e8ae21e

---

###https://www.sfchronicle.com/sf/article/survey-crime-san-francisco-17894081.php

---

###https://bdtechtalks.com/2023/04/17/open-source-chatgpt-alternatives/

---

###LOL: https://futurism.com/the-byte/smart-glasses-gpt-4

---

###https://www.uscc.gov/research/shein-temu-and-chinese-e-commerce-data-risks-sourcing-violations-and-trade-loopholes

---

###https://thenewstack.io/distributed-database-architecture-what-is-it/

---

###https://www.theverge.com/2023/4/20/23691468/google-ai-deepmind-brain-merger

---

###https://www.latent.space/p/agents

---

###https://towardsdatascience.com/top-10-pre-trained-models-for-image-embedding-every-data-scientist-should-know-88da0ef541cd

---

###https://restofworld.org/2023/ai-image-china-video-game-layoffs/

---

###<a href="docs/Global-Economics-Analyst_-The-Potentially-Large-Effects-of-Artificial-Intelligence-on-Economic-Growth-Briggs_Kodnani.pdf">Here</a> is a report from Goldman Sachs.

---

###https://webutility.io/csv-to-insert-sql-online [omg!]
---

###https://cacm.acm.org/magazines/2023/4/271229-artificial-intelligence-for-materials-discovery/fulltext

---

###'OI': https://spectrum.ieee.org/organoid-intelligence-computing-on-brain ; my take on **structure-based** intelligence: https://www.researchgate.net/publication/358886020_A_Physical_Structural_Perspective_of_Intelligence

---

###https://thegradient.pub/software2-a-new-generation-of-ais-that-become-increasingly-general-by-producing-their-own-training-data/

---

###https://github.com/booydar/t5-experiments/tree/scaling-report

---

###https://arxiv.org/abs/2304.09349 [LLM-Brain]

---

###How to probe GPT to reveal its most fundamental flaw - that it understands NOTHING (!): https://medium.com/@shlomi.sher/on-artifice-and-intelligence-f19224281bee

---

###'Generative anything' - https://javascript.plainenglish.io/bill-gates-people-dont-realize-what-s-coming-dc06d3b81c9d

---

###https://towardsdatascience.com/meta-ai-introduces-revolutionary-image-segmentation-model-trained-on-1-billion-masks-8f13c86a13a2

---

###Decompose -> train -> recompose: https://bootcamp.uxdesign.cc/introducing-composer-the-latest-breakthrough-in-ai-image-generation-9a2350e2b9a0

---

###LangChain: https://mohitmayank.medium.com/creating-gpt-driven-applications-using-langchain-a6ee08c383d4

---

###Jupyter->webapp: https://towardsdatascience.com/build-elegant-web-apps-right-from-jupyter-notebook-with-mercury-78d9ebcbbcaf

---

###https://javascript.plainenglish.io/my-boss-front-end-development-will-be-replaced-100-by-ai-354d79c79b5b

---

###https://medium.com/codingthesmartway-com-blog/discover-thinkgpt-the-cutting-edge-python-library-that-transforms-ai-into-a-powerful-thinking-c7e588bd28b4

---

###'Netwerk' diagram: https://observablehq.com/blog/r-to-javascript-shapiro

---

###There goes the 'Prompt Engineering' job :) https://arxiv.org/abs/2211.01910

---

###'Vector native' DBs: https://innerjoin.bit.io/why-you-should-care-about-vector-databases-1760186b5bf1 [look for this link in the article,
where you can play with vector search: https://innerjoin.bit.io/vector-similarity-search-in-postgres-with-bit-io-and-pgvector-c58ac34f408b].

---

###https://ai.facebook.com/blog/ai-dataset-animation-drawings

---

###https://the-decoder.com/nvidia-shows-text-to-video-for-stable-diffusion/

---

###https://datatables.net/examples/ [row-col data --> HTML table]

---

###https://discord.com/blog/how-discord-stores-trillions-of-messages (thanks, Akash!)

---

<!--
During NoSQL:
https://kapernikov.com/writing-a-high-quality-data-pipeline-for-master-data-with-apache-spark-part-1/
https://kapernikov.com/writing-a-high-quality-data-pipeline-for-master-data-with-apache-spark-part-2/

During DM:
https://kanaries.net/ [RATH]
docs/XGBoost.pdf

During ML:
https://github.com/salesforce/LAVIS and https://arxiv.org/pdf/2003.13230.pdf (Alibaba concept net)
https://www.forbes.com/sites/robtoews/2023/02/07/the-next-generation-of-large-language-models/
https://github.com/ahmedbahaaeldin/From-0-to-Research-Scientist-resources-guide
https://towardsdatascience.com/beautifully-illustrated-nlp-models-from-rnn-to-transformer-80d69faf2109
https://dwiuzila.medium.com/list/machine-learning-from-scratch-b35db8650093
deeplake.ai
https://udlbook.github.io/udlbook/
https://arxiv.org/abs/2210.05189 NNs are dec trees
http://incompleteideas.net/book/RLbook2020.pdf
ML_Cheatsheet_1678112002110.pdf in shared_docs/
https://sites.mitre.org/aifails/about-us/
https://ai.facebook.com/blog/robots-learning-video-simulation-artificial-visual-cortex-vc-1 'data'
https://www.techtarget.com/searchenterpriseai/definition/BERT-language-model
https://towardsdatascience.com/bert-for-dummies-step-by-step-tutorial-fb90890ffe03
https://medium.com/geekculture/list-of-open-sourced-fine-tuned-large-language-models-llm-8d95a2e0dc76
https://www.frontiersin.org/journals/science/articles/10.3389/fsci.2023.1017235 OI
https://finbarr.ca/five-years-of-gpt-progress/ - LLMs

During viz:
http://bridges-cs.herokuapp.com/assignments/1004/bridges_public

During review
https://leerob.io/blog/backend - great overview! Plus: 'Seattle DB Report'.
https://mad.firstmarkcap.com/ and https://mattturck.com/mad2023/
-->

<!-- ********************************** -->

<br><br>