The horse

I finally got my Fodera after 13 months! It was tricky to EQ. Now I know why most session musicians use a Fender Precision bass. 🙂

I'm experimenting with a new writing process where I write a 4-measure bass line then use Logic Pro instruments to complete the song. It takes me too long to write verse, chorus, bridge and other parts on bass. Hopefully, reducing the work will enable me to complete songs more consistently.

I'm also trying to get better at mixing tracks. I tried a recommendation to bring all the tracks to -18 dB before mixing, then adjust beneath that ceiling. After that, I apply gain, EQ, compression and a limiter on the stereo out channel. I like how it sounds.
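
The -18 dB starting point is simple arithmetic on levels. Here is a minimal Python sketch, assuming the target refers to peak level in dBFS and that samples are floats between -1.0 and 1.0. The function names and the sample values are made up for illustration; Logic Pro does this with its own gain controls.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS for float samples in [-1.0, 1.0]."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(peak)

def gain_to_target(samples, target_db=-18.0):
    """Gain change in dB that moves the track's peak onto target_db."""
    return target_db - peak_dbfs(samples)

def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

# Made-up samples with a peak of 0.5 (roughly -6 dBFS).
track = [0.0, 0.5, -0.25, 0.1]
gain_db = gain_to_target(track)
staged = apply_gain(track, gain_db)
```

With these numbers the track needs roughly 12 dB of cut to peak at -18 dB, which leaves headroom for the processing on the stereo out.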

Kaggle Credit Risk Competition

Kaggle Competition Goal

Detect which loans are at risk of default using credit application data and third-party credit data.

My Approach

Fetch the Kaggle competition data from the Home Credit Default Risk Competition, generate numeric and categorical features, then build models using TensorFlow, scikit-learn and XGBoost.
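
As a rough illustration of what "numeric and categorical features" means here, the sketch below standardizes one numeric column and one-hot encodes one categorical column using only the Python standard library. The column names and rows are hypothetical, and this is not the code from the repo; in practice scikit-learn's StandardScaler and OneHotEncoder do the same job.

```python
from statistics import mean, pstdev

# Hypothetical application rows; the column names are illustrative only.
rows = [
    {"income": 50000.0, "contract_type": "cash"},
    {"income": 80000.0, "contract_type": "revolving"},
    {"income": 65000.0, "contract_type": "cash"},
]

def standardize(values):
    """Scale a numeric column to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Encode a categorical column as one 0/1 column per category."""
    categories = sorted(set(values))
    encoded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, encoded

income = standardize([r["income"] for r in rows])
categories, contract = one_hot([r["contract_type"] for r in rows])

# One feature vector per loan: [scaled income, one-hot contract type...]
features = [[x] + flags for x, flags in zip(income, contract)]
```

Each loan ends up as a flat list of numbers, which is the shape that model libraries expect as input.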

GitHub

See my kaggle_credit_risk GitHub repo to view the source for generating features, training models and running model experiments.

Jumping in the leaves

Inspired by watching a slow-motion video of my kids jumping into a pile of leaves.

The "lead" track is fretted bass with the "Phantom Tremelo" guitar effect. The "bass" track is fretless bass with a little chorus after the bridge. MIDI Moonlight Ark synths play in the background, and a MIDI Hypnotic Synth arpeggiator and Delicate Bells come in at the bridge.

Data 101: MySQL Tutorial using a Diabetes Database

Introduction

This tutorial demonstrates how to use MySQL and MySQL Workbench to create and explore a MySQL database containing diabetes treatment records.

Original Dataset

I generated the SQL for this tutorial using the Diabetes Data Set from the UCI Machine Learning Repository. I added a "patients" table using random celebrity names.

MySQL Setup

Connecting to the Database

Open MySQL Workbench. Click Database -> Connect to Database

Connect to a MySQL Database via Menu

The local MySQL server should be running on Hostname: 127.0.0.1 and Port 3306. Your hostname or IP address may be different if you are connecting to another host running MySQL. Click on “Password” to enter the password generated during installation.

Enter IP and Password for MySQL database.

Save the password here:

Save your password (MacOS)

Now that you are connected to the server, list the installed databases using:

SHOW SCHEMAS;

You should see output similar to this:

MySQL Workbench Output of Show Schema

Creating the Diabetes Database

In MySQL Workbench, execute “create_diabetes_db.sql” using “File -> Open SQL Script”.

"Open SQL Script" Menu

You can find this file in my GitHub repo within the medical_databases/sql/ directory.

Open "create_diabetes_db.sql"

Once the file is open, use Command-A to “Select All” (or Edit -> Select All).

"Select All" SQL in window

Now click the Execute button (leftmost lightning bolt). The “Action Output” window should show:

Output from "create db" script

Loading the Diabetes Data into the Database

Now open “diabetes_data.sql”, which you can download from GitHub.

Open "diabetes_data.sql"

Once the file is open, use Command-A to “Select All” (or Edit -> Select All). This is a big file, so you’ll only see the beginning in the window.

"Select All" Diabetes data SQL

Now click the Execute button (leftmost lightning bolt). The “Action Output” window should show something similar to the output below. There are a lot of commands in this file, but you should see some “CREATE TABLE” and “INSERT” statements.

Output from executing diabetes data SQL
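
If you want to see the same CREATE TABLE, INSERT and SELECT pattern without a MySQL server, here is a small Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL, and the patients columns and rows are hypothetical, not the tutorial's real schema or data.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server; the columns and rows
# below are hypothetical, not the tutorial's real schema or data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO patients (id, name) VALUES (?, ?)",
    [(1, "Patient One"), (2, "Patient Two")],
)

# Read the rows back, along with the column names from the cursor.
cursor = conn.execute("SELECT * FROM patients")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
conn.close()
```

SQLite and MySQL differ in data types and many statements (SHOW SCHEMAS, for example, is MySQL-only), so treat this as a way to practice basic queries rather than a replacement for the Workbench steps.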

Now create a new “Query Tab” (File -> New Query Tab, or Command-T), then try the query:

SELECT * FROM patients;

The result grid pane should show something similar to the following.

"SELECT * FROM patients" Output

Kaggle Seizure Prediction Competition

Kaggle Competition Goal

Detect seizure (preictal) or non-seizure (interictal) segments of intracranial electroencephalography (iEEG) data. See the Kaggle EEG Competition page for more details.

My Approach:

  • Extract basic stats and FFT features for non-overlapping 30-second iEEG windows.
  • Detect signal dropout and impute missing data with the mean of each feature per window.
  • Predict seizure and non-seizure segments using a stacked model.
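
The first two steps above can be sketched in Python as follows. This is a toy illustration, not the R code from the repo: it assumes dropout shows up as an all-zero window, reads "impute with mean" as filling a missing feature with that feature's mean over the other windows, and uses a naive DFT on 8-sample windows. A real 30-second window would hold 30 times the sampling rate in samples and would call for a proper FFT such as numpy.fft.rfft.

```python
import cmath
import math
from statistics import mean, pstdev

def windows(signal, size):
    """Split a signal into non-overlapping windows, dropping any remainder."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def dft_magnitudes(window):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (fine for a demo only)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]

def window_features(window):
    """Basic stats plus the dominant non-DC frequency bin; None on dropout."""
    if all(x == 0 for x in window):  # assumption: dropout is an all-zero window
        return {"mean": None, "std": None, "peak_bin": None}
    mags = dft_magnitudes(window)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return {"mean": mean(window), "std": pstdev(window), "peak_bin": peak_bin}

def impute_with_mean(feature_rows):
    """Replace None entries with that feature's mean over observed windows."""
    filled = [dict(row) for row in feature_rows]
    for name in feature_rows[0]:
        observed = [r[name] for r in feature_rows if r[name] is not None]
        fill = mean(observed) if observed else 0.0
        for r in filled:
            if r[name] is None:
                r[name] = fill
    return filled

# Toy signal: one cycle per window, then a dropout, then two cycles per window.
size = 8
w1 = [math.sin(2 * math.pi * 1 * t / size) for t in range(size)]
w2 = [0.0] * size
w3 = [math.sin(2 * math.pi * 2 * t / size) for t in range(size)]

rows = [window_features(w) for w in windows(w1 + w2 + w3, size)]
filled = impute_with_mean(rows)
```

The imputed rows are what would feed the stacked model; the dropout window keeps its place in the sequence instead of being discarded.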

Model Details

For more details about the model, see my GitHub repo with the documentation and R code.

Final Thoughts

Completed Coursera Data Science Specialization

I completed the 10-course data science specialization by Johns Hopkins University on Coursera.

Here are my certificates:
https://www.coursera.org/account/accomplishments/specialization/JKSAW82GLH35

Links

Retrospective

I enjoyed the course. This course took me waaaay more time than I thought because I struggled with a few issues.

  • First, I wish I'd started by taking the NLP online course before starting the Capstone (https://www.youtube.com/watch?v=-aMYz1tMfPg).
  • I had trouble installing RWeka and rJava, and it took me several days to work through the issues. I eventually moved to using quanteda (https://cran.r-project.org/web/packages/quanteda/vignettes/quickstart.html).
  • I also waited far too long to develop a method to test my model using a subset of the training data, so I could test whether changes to my model improved or reduced performance. It turns out that my model trained on a 25% sample performed just as well as a model trained on 100%. I should have spent more time trying different models with the 25% sampled data.
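
The subset-testing idea in the last point can be sketched like this. It is a Python illustration with made-up data and hypothetical function names, not the R code from the capstone: hold out a fixed test set once, then train candidate models on a 25% sample of what remains and score every candidate against the same held-out set.

```python
import random

def split_holdout(items, holdout_fraction=0.2, seed=42):
    """Shuffle once, then carve off a fixed held-out set for scoring."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * holdout_fraction)
    return shuffled[cut:], shuffled[:cut]  # (train, holdout)

def subsample(train, fraction, seed=42):
    """Draw a random subset of the training data without replacement."""
    rng = random.Random(seed)
    return rng.sample(train, int(len(train) * fraction))

# Stand-in for the real text corpus.
corpus = [f"sentence {i}" for i in range(1000)]

train, holdout = split_holdout(corpus)
train_25 = subsample(train, 0.25)
```

Because the held-out set never overlaps the training sample, a change that scores better on it is more likely a real improvement, and training on 25% keeps each experiment fast.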

I'm thankful for the Discussion Forum and final peer review process. Both helped me learn how I can improve my model and demo application. I really appreciate the instructors for creating this specialization. I've learned a lot.

Activity Recognition of Weight Lifting Exercises Data

Course Project for Practical Machine Learning by Johns Hopkins University on Coursera

The project includes the following reports:

Qualitative Activity Recognition of Weight Lifting Exercises Data: In this project, we use R to build a classifier using the sensor data. The data consists of a training set containing over 19,000 samples, each with 152 variables and a classe outcome variable with the value ‘A’, ‘B’, ‘C’, ‘D’ or ‘E’. The testing set consists of 20 samples without the classe outcome variable. The goal is to build a classifier using the training data to predict the classe of the testing data.

Data
The project uses sensor data collected by Groupware@LES