{ "cells": [ { "cell_type": "markdown", "id": "24e7652f", "metadata": {}, "source": [ "# \"Python for ML\" Quick Introduction" ] }, { "cell_type": "code", "execution_count": 1, "id": "92bb5be1", "metadata": {}, "outputs": [], "source": [ "# imports are possible anywhere in the code, but the convention is to place all required import statements\n", "# right at the top of a file\n", "\n", "# numpy is a Python module for numerical computing that offers both functionality and efficiency\n", "import numpy as np\n", "\n", "# pandas is very well suited to working with tabular data, whether csv, xls, or xlsx\n", "import pandas as pd\n", "\n", "# plotting settings\n", "pd.plotting.register_matplotlib_converters()\n", "\n", "# matplotlib is a very extensive module for creating visualizations/plots\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "# seaborn makes it easier to create commonly used plot types;\n", "# it is itself built on matplotlib, and the two can be combined\n", "# a nice introduction to Seaborn: https://www.kaggle.com/learn/data-visualization\n", "import seaborn as sns" ] }, { "cell_type": "markdown", "id": "0562db47", "metadata": {}, "source": [ "There are different cell types in Jupyter: code and Markdown. Markdown lets you document the code more nicely than comments in the code itself. *Various* **formatting options** and even LaTeX-style mathematical formulas are possible, both inline ($h_\\theta(x) = \\theta^Tx$) and centered on separate lines:\n", "\n", "$$h_\\theta(x) = \\theta^Tx$$\n", "\n", "
HTML is also recognized.
\n", "\n", "We now load a CSV file with Pandas:" ] }, { "cell_type": "markdown", "id": "55f91177", "metadata": {}, "source": [ "## Loading data" ] }, { "cell_type": "code", "execution_count": 2, "id": "724b4875", "metadata": {}, "outputs": [], "source": [ "data_file_path = '../data/exam-iq.csv'\n", "data = pd.read_csv(data_file_path)\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "b5661e8d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " | Pass | Hours | IQ | \n", "
---|---|---|---|
0 | 0 | 0.50 | 110 |
1 | 0 | 0.75 | 95 |
2 | 0 | 1.00 | 118 |
3 | 0 | 1.25 | 97 |
4 | 0 | 1.50 | 100 |
5 | 0 | 1.75 | 110 |
6 | 0 | 1.75 | 115 |
7 | 1 | 2.00 | 104 |
8 | 1 | 2.25 | 120 |
9 | 0 | 2.50 | 98 |
10 | 1 | 2.75 | 118 |
11 | 0 | 3.00 | 88 |
12 | 1 | 3.25 | 108 |
13 | 1 | 4.00 | 109 |
14 | 1 | 4.25 | 110 |
15 | 1 | 4.50 | 112 |
16 | 1 | 4.75 | 97 |
17 | 1 | 5.00 | 102 |
18 | 1 | 5.50 | 109 |
19 | 0 | 3.50 | 125 |
\n", " | Pass | Hours | IQ | \n", "
---|---|---|---|
0 | 0 | 0.50 | 110 |
1 | 0 | 0.75 | 95 |
2 | 0 | 1.00 | 118 |
3 | 0 | 1.25 | 97 |
4 | 0 | 1.50 | 100 |
\n", " | Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | ... | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 4/02/2016 | 2.5 | 3067.0 | ... | 1.0 | 0.0 | 156.0 | 79.0 | 1900.0 | Yarra | -37.8079 | 144.9934 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 4/03/2017 | 2.5 | 3067.0 | ... | 2.0 | 0.0 | 134.0 | 150.0 | 1900.0 | Yarra | -37.8093 | 144.9944 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 4/06/2016 | 2.5 | 3067.0 | ... | 1.0 | 2.0 | 120.0 | 142.0 | 2014.0 | Yarra | -37.8072 | 144.9941 | Northern Metropolitan | 4019.0 |
6 | Abbotsford | 124 Yarra St | 3 | h | 1876000.0 | S | Nelson | 7/05/2016 | 2.5 | 3067.0 | ... | 2.0 | 0.0 | 245.0 | 210.0 | 1910.0 | Yarra | -37.8024 | 144.9993 | Northern Metropolitan | 4019.0 |
7 | Abbotsford | 98 Charles St | 2 | h | 1636000.0 | S | Nelson | 8/10/2016 | 2.5 | 3067.0 | ... | 1.0 | 2.0 | 256.0 | 107.0 | 1890.0 | Yarra | -37.8060 | 144.9954 | Northern Metropolitan | 4019.0 |
5 rows × 21 columns
\n", "