Python is a great language for data science and AI; a comprehensive list of its features is available in book form. When installing Python, always use a venv (virtual environment).
A venv is easy to set up and creates a separate Python environment, so you do not interfere with your system Python installation. Some IDEs may do this automatically, but it is still best practice to create one yourself and bind the IDE to it. To do this:
Download Python version 3.9.5 just as shown in the first lecture.
After the download, do the following additional step:
On macOS, you need to run the source command every time you start a new terminal window, or add it to .zprofile.
On Windows, first install gitbash and do all your terminal work from gitbash, as it is more Linux-like. In gitbash, run
python -m venv ~/ENV3
source ~/ENV3/Scripts/activate
If you would like gitbash to activate the venv automatically, add the source line to .bashrc and/or .bash_profile.
If you use VSCode, you can also create a venv individually in the directory where you have your code.
On Mac: cd TO YOUR DIR; python3.9 -m venv .
On Windows: cd TO YOUR DIR; python -m venv .
Then start VSCode in the directory and it will ask you to use this venv. However, the global ENV3 venv may be better, and you can set your interpreter to it.
In PyCharm, we recommend you use ENV3 and set it as the global interpreter.
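To confirm that a terminal or IDE is really using the venv, a short check like the following can help. This is a minimal sketch: the helper name in_virtualenv is made up for illustration, and it relies only on the standard sys module.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv.

    Inside a venv, sys.prefix points at the venv directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


# Show which interpreter is running and whether it lives in a venv.
print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
```

If the second line reports False after you activated ENV3, the IDE or shell is most likely still bound to the system Python.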
The first exercise will require a simple for loop, while the second is more complicated, requiring nested for loops and a break statement.
General Instructions: Create two different files with the extension .ipynb, one for each problem. Name the first file factorial.ipynb for the factorial problem and the second prime_number.ipynb for the prime number problem.
Write a program that can find the factorial of any given number. For example, the factorial of the number 5 (often written as 5!) is 1 × 2 × 3 × 4 × 5, which equals 120. Your program should take as input an integer from the user.
Note: The factorial is not defined for negative numbers, and the factorial of zero is 1; that is, 0! = 1.
You should:
If the number is less than zero, return with an error message.
Check whether the number is zero; if it is, the answer is 1, so print this out.
Otherwise, use a loop to generate the result and print it out.
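The three steps above can be sketched as follows. This is one possible outline, not the expected solution: the function name factorial and the choice to raise a ValueError for negative input are assumptions, and the values are hard-coded here where the exercise asks for user input.

```python
def factorial(n: int) -> int:
    """Compute n! with a simple for loop."""
    if n < 0:
        # The factorial is not defined for negative numbers.
        raise ValueError("the factorial is not defined for negative numbers")
    if n == 0:
        # By definition, 0! = 1.
        return 1
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


# In the notebook, the value would come from int(input("Enter an integer: ")).
for value in (5, 0, -3):
    try:
        print(f"{value}! = {factorial(value)}")
    except ValueError as err:
        print(f"Error: {err}")
```

For the value 5 this prints 5! = 120, for 0 it prints 0! = 1, and for -3 it prints the error message.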
A prime number is a positive whole number greater than 1 that has no divisors other than 1 and itself. For example, the numbers 2, 3, 5, and 7 are prime numbers because they cannot be divided by any other whole number. However, the numbers 4 and 6 are not, because both can be divided by 2; in addition, 6 can also be divided by 3.
You should write a program that calculates the prime numbers from 1 up to the
value input by the user.
You should:
If the user inputs a number below 2, print an error message.
For any number of 2 or greater, loop over each integer from 2 to that number and determine whether it can be divided by another number (you will probably need two for loops for this, one nested inside the other).
For each number that cannot be divided by any other number (that is, it is a prime number), print it out.
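The nested-loop approach described above can be sketched like this. It is an illustrative outline under assumptions: the function name primes_up_to is made up, the limit is hard-coded where the exercise asks for user input, and the primes are collected in a list before printing.

```python
def primes_up_to(limit: int) -> list:
    """Return all prime numbers from 2 up to and including limit."""
    if limit < 2:
        raise ValueError("please enter a number of 2 or greater")
    primes = []
    for candidate in range(2, limit + 1):      # outer loop: each number to test
        is_prime = True
        for divisor in range(2, candidate):    # inner loop: possible divisors
            if candidate % divisor == 0:
                is_prime = False
                break                          # one divisor is enough to rule it out
        if is_prime:
            primes.append(candidate)
    return primes


# In the notebook, the limit would come from int(input("Enter a number: ")).
for prime in primes_up_to(20):
    print(prime)
```

With a limit of 20 this prints 2, 3, 5, 7, 11, 13, 17, and 19, one per line. The break statement stops the inner loop at the first divisor found, which is the role the exercise description assigns to it.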
This course introduces the students to AI-First principles. The notes are prepared for the course taught in 2021.
This course introduces students to AI-First Engineering Cybertraining. We provide the following sections:
Class Material
As part of this class, we will be using a variety of sources. To
simplify the presentation we provide them in a variety of smaller
packaged material including books, lecture notes, slides, presentations
and code.
Note: We will regularly update the course material, so please
always download the newest version. Some browsers try to
be fancy and cache previous page visits. So please make sure to
refresh the page.
This course is built around the revolution driven by AI, and in particular deep learning, that is transforming all activities: industry, research, and lifestyle. It will have a similar structure to The Big Data Class, and the details of the course will be adapted to the interests of participating students. It can include significant deep learning programming.
All activities (Industry, Research, and Lifestyle) are being transformed by Artificial Intelligence (AI) and Big Data. AI is currently dominated by deep learning implemented on a global pervasive computing environment, the global AI supercomputer. This course studies the technologies and applications of this transformation.
We review Core Technologies driving these transformations: Digital transformation moving to AI Transformation, Big Data, Cloud Computing, software and data engineering, Edge Computing and Internet of Things, The Network and Telecommunications, Apache Big Data Stack, Logistics and company infrastructure, Augmented and Virtual reality, Deep Learning.
Several new “Industries” have emerged over the last 25 years: the Internet, remote collaboration and social media, search, cybersecurity, smart homes and cities, and robotics. However, our focus is on traditional “Industries” transformed: Computing; Transportation: ride-hailing, drones, electric self-driving autos/trucks, road management, travel; Construction industry; Space; Retail stores and e-commerce; Manufacturing: smart machines, digital twins; Agriculture and Food; Hospitality and Living spaces: buying homes, hotels, “room hailing”; Banking and Financial Technology: insurance, mortgage, payments, stock market, bitcoin; Health: from DL for pathology to personalized genomics to remote surgery; Surveillance and Monitoring: civilian disaster response, military command and control; Energy: solar, wind, oil; Science: more data better analyzed, DL as the new applied mathematics; Sports: including Sabermetrics; Entertainment; Gaming, including eSports; News, advertising, information creation and dissemination; Education; Fake news and Politics; Jobs.
We select material from above to match student interests.
Students can take the course in either software-based or report-based mode. The lectures will be offered in video form with a weekly discussion class. Python and TensorFlow will be the main software used.
This course does not require you to do much Linux. However, if you do
need it, we recommend the following as a starting point.
The most
elementary Linux features can be learned in 12 hours. This includes
bash, editor, directory structure, managing files. Under Windows, we
recommend using gitbash, a terminal with all the
commands built-in that you would need for elementary work.
You can contribute to the material with useful links and sections that
you find. Just make sure that you do not plagiarize when making
contributions. Please review our guide on plagiarism.
Computer Needs
This course does not require a sophisticated computer. Most of the
things can be done remotely. Even a Raspberry Pi with 4 or 8GB could
be used as a terminal to log into remote computers. This will cost you
between $50 and $100, depending on the version and equipment. However,
we will not teach you how to use or set up a Pi or another
computer in this class. This is for you to do and find out.
In case you need to buy a new computer for school, make sure the
computer is upgradable to 16GB of main memory. We no longer
recommend HDDs; use SSDs instead. Buy a fast one, as not every
SSD is the same. Samsung offers some under the EVO Pro
branding. Get as much memory as you can afford. Also, make sure
you back up your work regularly, either to online storage such as
Google or to an external drive.
2.1 - Project Guidelines
We present here the AI First Engineering project guidelines
All students of this class are doing a software project. (Some of our classes allow non-software projects.)
The final project is 50% of the grade.
All projects must have a well-written report as well as the software
component.
We must be able to run the software from the class GitHub repository. To do so, you must include an appendix to your project
report describing how to run your project.
If you use containers, you must describe how to create them from Dockerfiles.
If you use IPython notebooks, you must include a button or link so they can be run in Google Colab.
There is a useful set of example projects submitted in previous
classes.
In this class you do not have the option to work on a joint report; however, you can collaborate.
Note: all reports and projects are open for everyone as they are open source.
Details
The major deliverable of the course is a software project with a report. The
project must include a programming part to get a full grade. It is
expected that you identify a suitable analysis task and data set for the
project and that you learn how to apply this analysis as well as to
motivate it. It is part of the learning outcome that you determine this
instead of us giving you a topic. Each student will present this topic in class on April 1.
It is desired that the project has a novel feature in it. A project
that simply reproduces existing work may not receive the best grade, but this depends
on what the analysis is and how you report it.
However, “major advances” and solving a full-size problem are not required. You can simplify both the network and the dataset to be able to complete the project.
The project write-up should describe the “full-size” realistic problem, with the software exemplifying an instructive example.
One goal of the class is to use open source technology wherever
possible. As a beneficial side product of this, we are able to
distribute all previous reports that use such technologies. This means
you can cite your own work, for example, in your resume.
For big data, we have more than 1000 data sets we point to.
Comments on Example Projects from previous classes
Warning: Please note that we do not make any quality assumptions about the
published papers that we list here. It is up to you to identify
outstanding papers.
Warning: Also note that these activities took place in previous classes,
and the content of this class has since been updated or the focus has shifted.
In particular, chapters on Google Colab, AI, and DL were added to the course after the
date of most projects. Also, some of the documents include an additional
assignment called Technology Review. These are not the same as the
project report or review we refer to here; they are just assignments
done in 2-3 weeks. So please do not use them as a comparison
with your own work. The activities we ask from you are substantially
more involved than the technology reviews.
Format of Project
Plagiarism is of course not permitted. It is your responsibility
to know what plagiarism is. We provide a detailed description of it in book form; you can also take the IU
plagiarism test to learn more.
All project reports must be provided on github.com as a markdown file.
All images must be in an images directory. You must use proper
citations. Images copied from the Internet must have a citation in the
image caption. Please use the IEEE citation format and do not use
APA or Harvard style. Simply use footnotes in markdown, but treat them as
regular citations and not text footnotes (i.e., adhere to the IEEE rules).
All projects and reports must
be checked into the GitHub repository. Please take a look at the example we created for you.
The report will be stored on github.com as follows:
./project/index.md
./project/images/mysampleimage.png
Length of Project Report
Software Project Reports: 2500 - 3000 Words.
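Before submitting, you can check the report length with a few lines of Python. This is a rough sketch under assumptions: how words are counted for grading is not specified here, so the helper simply counts whitespace-separated tokens (markdown syntax included), and the names count_words and check_length are made up for illustration.

```python
from pathlib import Path

MIN_WORDS, MAX_WORDS = 2500, 3000  # limits for software project reports


def count_words(markdown_text: str) -> int:
    """Count whitespace-separated tokens; markdown syntax is counted too."""
    return len(markdown_text.split())


def check_length(markdown_text: str) -> str:
    """Describe whether the text falls inside the required word range."""
    n = count_words(markdown_text)
    if n < MIN_WORDS:
        return f"{n} words: below the minimum of {MIN_WORDS}"
    if n > MAX_WORDS:
        return f"{n} words: above the maximum of {MAX_WORDS}"
    return f"{n} words: within the {MIN_WORDS}-{MAX_WORDS} range"


# Assumes the script is started from the repository root.
report = Path("project/index.md")
if report.exists():
    print(check_length(report.read_text(encoding="utf-8")))
else:
    print(f"{report} not found; run this from the repository root.")
```

Because citations, image links, and headings are counted as well, treat the result as an estimate and not as an exact measure.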
Possible sources of datasets
Given next are links to collections of datasets that may be of use for
homework assignments or projects.
Why should you not just paste and copy into the GitHub GUI?
We may make comments directly in your markdown or program files. If you just paste and copy, you may overlook such comments. Hence, only paste and copy small paragraphs, and only if you need to. The best way of using GitHub is from the command line with editors such as PyCharm and emacs.
Can I do a project that relates to my company?
Please go ahead and do so but make sure you use open-source
data, and all results can be shared with everyone. If that is
not the case, please pick a different project.
Can I use Word, Google Docs, or LaTeX to hand in the final
document?
No, you must use github.com and markdown.
Please note that exporting documents from Word or Google Docs
can result in a markdown file that needs substantial cleanup.
Where do I find more information about markdown and plagiarism?
There are many online markdown editors available. One of them is
<https://dillinger.io/>.
Use them to write your document or check the one you have
developed in another editor such as Word or Google Docs.
Remember, online editors can be dangerous in case you lose
your network connection. So we recommend developing small portions
and copying them into a locally managed document that you then
check into github.com.
GitHub GUI (recommended): this works very well, but the
markdown is slightly limited. We use Hugo’s markdown.
PyCharm (recommended): works very well.
emacs (recommended): works very well.
What level of expertise and effort do I need to write markdown?
We taught 10-year-old students to use markdown in less than 5
minutes.
What level of expertise is needed to learn BibTeX?
We have taught BibTeX to inexperienced students while using
JabRef in less than an hour (but it is not required for this
course). You can use footnotes while making sure that the
footnotes follow the IEEE format.
How can I get IEEE formatted footnotes?
Simply use JabRef and copy and paste the text it produces.
Note: All URL’s must be either in [TEXT](URLHERE) or
<URLHERE> format.
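A quick way to find URLs that violate this rule is a small regular-expression check. The sketch below is only an approximation under assumptions: the function name find_bare_urls is hypothetical, and the pattern treats any http or https URL that is not directly preceded by "<" or "(" as bare.

```python
import re

# A URL counts as "bare" when it is not directly preceded by "<" or "(",
# i.e. it is neither written as <URL> nor as the target of [TEXT](URL).
BARE_URL = re.compile(r"(?<![<(])https?://[^\s<>()]+")


def find_bare_urls(markdown_text: str) -> list:
    """Return all URLs that are not wrapped in <...> or [TEXT](...)."""
    return BARE_URL.findall(markdown_text)


sample = "See https://example.org, <https://a.org>, and [docs](https://b.org)."
print(find_bare_urls(sample))
```

For the sample text, only the first URL is reported; the other two already use one of the accepted formats. A trailing punctuation mark may be included in a reported URL, so inspect each hit by hand.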
3 - Big Data 2020
This course introduces the students to Cloud Big Data Applications. The notes are prepared for the course taught in 2020.
This course introduces students to Cloud Big Data
Applications. We provide the following sections:
Class Material
As part of this class, we will be using a variety of sources. To
simplify the presentation we provide them in a variety of smaller
packaged material including books, lecture notes, slides, presentations
and code.
Note: We will regularly update the course material, so please
always download the newest version. Some browsers try to
be fancy and cache previous page visits. So please make sure to
refresh the page.
This course does not require you to do much Linux. However, if you do
need it, we recommend the following as a starting point.
The most
elementary Linux features can be learned in 12 hours. This includes
bash, editor, directory structure, managing files. Under Windows, we
recommend using gitbash, a terminal with all the
commands built-in that you would need for elementary work.
You can contribute to the material with useful links and sections that
you find. Just make sure that you do not plagiarize when making
contributions. Please review our guide on plagiarism.
Computer Needs
This course does not require a sophisticated computer. Most of the
things can be done remotely. Even a Raspberry Pi with 4 or 8GB could
be used as a terminal to log into remote computers. This will cost you
between $50 and $100, depending on the version and equipment. However,
we will not teach you how to use or set up a Pi or another
computer in this class. This is for you to do and find out.
In case you need to buy a new computer for school, make sure the
computer is upgradable to 16GB of main memory. We no longer
recommend HDDs; use SSDs instead. Buy a fast one, as not every
SSD is the same. Samsung offers some under the EVO Pro
branding. Get as much memory as you can afford. Also, make sure
you back up your work regularly, either to online storage such as
Google or to an external drive.
4 - REU 2020
This course introduces the REU students to various topics in Intelligent Systems Engineering. The course was taught in Summer 2020.
Computational Foundations
Brief Overview of the Praxis AI Platform and Overview of the Learning Paths
Accessing Praxis Cloud
Introduction To Linux and the Command Line
Jupyter Notebooks
A Brief Intro to Machine Learning in Google Colaboratory
Jupyter notebook on Google Colab for COVID-19 data analysis (ipynb)
Follow-up on Discussion of AI remaking Industry worldwide
Class on AI First Engineering with 35 videos describing technologies
and particular industries Commerce, Mobility, Banking, Health,
Space, Energy in
detail (youtube playlist)
Introductory Video (one of 35) discussing the Transformation -
Industries invented and remade through AI
(youtube)
AI4ESS Summer School: If you are interested in using machine
learning for climate science research, I highly recommend that you
register for the Artificial Intelligence for Earth System
Science summer school and interactive workshops, which conveniently
run June 22nd to 26th. Prior experience with TensorFlow/Keras
via Google Colab should be all the introductory skill needed to
follow along. Register ASAP.
https://www2.cisl.ucar.edu/events/summer-school/ai4ess/2020/artificial-intelligence-earth-system-science-ai4ess-summer-school
8 - Intelligent Systems
This book introduces you to the concepts used to build Intelligent Systems.
You will learn here about using Linux, focusing mostly on shell command-line usage.
Linux will be used on many computers to develop and interact with cloud services. Especially popular are the command-line tools, which exist even on Windows. Thus we can have a uniform environment on all platforms using the bash shell.
For ePub, we recommend using iBooks on macOS and calibre on all other systems.
Topics covered include:
Linux Shell
Perl one liners
Refcards
SSH
keygen
agents
port forwarding
Shell on Windows
ZSH
10 - Markdown
An important part of any scientific research is to communicate and
document it. Previously we used LaTeX in this class to provide the
ability to contribute professional-looking documents. However, here we
will describe how you can use markdown to create scientific documents.
We use markdown also on the Web page.
The document is available as an online book in ePub and PDF.
For ePub, we recommend using iBooks on macOS and calibre on all other systems.
Topics covered include:
Plagiarism
Writing Scientific Articles
Markdown (Pandoc format)
Markdown for presentations
Writing papers and reports with markdown
Emacs and markdown as an editor
Graphviz in markdown
11 - OpenStack
You will have the opportunity to learn more about OpenStack. OpenStack is a cloud toolkit that allows you to do bare-metal and virtual machine provisioning.
OpenStack is usable via command-line tools and REST APIs. You will be able to experiment with it on Chameleon Cloud.
OpenStack with Chameleon Cloud
We have put together a subset of information from the Chameleon Cloud
manual that is useful for using OpenStack. It focuses mostly
on virtual machine provisioning. The reason we put our own
documentation here is to promote more secure utilization of Chameleon
Cloud.
Additional material on how to uniformly access OpenStack via a multicloud command line tool is available at:
We highly recommend you use the multicloud environment as it will
allow you also to access AWS, Azure, Google, and other clouds from the
same command line interface.
The Chameleon Cloud document is available as an online book in ePub and PDF from the
following Web page:
For ePub, we recommend using iBooks on macOS and calibre on all other systems.
Topics covered include:
Using Chameleon Cloud more securely
Resources
Hardware
Charging
Getting Started
Virtual Machines
Commandline Interface
Horizon
Heat
Bare metal
FAQ
12 - Python
Here you will find information about learning the Python programming language and its ecosystem.
Python is an easy to learn programming language. It has efficient
high-level data structures and a simple but effective approach to
object-oriented programming. Python’s simple syntax and dynamic
typing, together with its interpreted nature, make it an ideal
language for scripting and rapid application development in many areas
on most platforms.
Introduction to Python
This online book will provide you with enough information to conduct your programming for the cloud in Python. Although this
introduction was first developed for Cloud Computing related classes,
it is a general introduction suitable for other classes.
The document is available as an online book in ePub and PDF.
For ePub, we recommend using iBooks on macOS and calibre on all other systems.
Topics covered include:
Python Installation
Using Multiple different Python Versions
First Steps
REPL
Editors
Google Colab
Python Language
Python Modules
Selected Libraries
Python Cloudmesh Common Library
Basic Matplotlib
Basic NumPy
Python Data Management
Python Data Formats
Python MongoDB
Parallelism in Python
SciPy
Scikit-learn
Elementary Machine Learning
Dask
Applications
Fingerprint Matching
Face Detection
13 - MNIST Classification on Google Colab
In this mini-course, you will learn how to use Google Colab while using the well-known MNIST example.
We discuss in this module how to create a simple IPython Notebook to
solve an image classification problem. MNIST contains a set of 70,000
grayscale pictures of handwritten digits (60,000 for training and 10,000 for testing), each 28 × 28 pixels in size.
Prerequisite
Knowledge of Python
Google account
Effort
1 hour
Topics covered
Using Google Colab
Running an AI application on Google Colab
1. Introduction to Google Colab
This module introduces you to using Google Colab to run deep learning models.