Friday, January 10, 2020

Data Science - Machine Learning Datasets for the public

This post is a collection of publicly available datasets for those looking for data to do Machine Learning.

Kaggle
https://www.kaggle.com/

UCI Machine Learning Repository
https://archive.ics.uci.edu/ml/datasets.php

OpenML
https://www.openml.org/

Melbourne City Council
Open Data Platform
https://data.melbourne.vic.gov.au/
Census of Land Use and Employment (CLUE)
https://data.melbourne.vic.gov.au/clue
https://www.melbourne.vic.gov.au/about-melbourne/research-and-statistics/city-economy/census-land-use-employment/Pages/clue-data-reports.aspx


Australian Bureau of Statistics - may need to search a bit
https://www.abs.gov.au
or a specific example:
https://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/5512.0Main%20Features42017-18?opendocument&tabname=Summary&prodno=5512.0&issue=2017-18&num=&view=

The U.N.
https://www.un.org/en/databases/
a specific example:
https://unstats.un.org/unsd/demographic/products/default.htm


Stock Trading Data
https://www.worldtradingdata.com/
https://finance.yahoo.com/
http://www.eoddata.com/register.aspx
10 New Ways to Download Historical Stock Quotes for Free
https://www.quantshare.com/sa-620-10-new-ways-to-download-historical-stock-quotes-for-free
Specifically, Technical Analysis indicators are available in the Python library TA-Lib. A description of how this can be used is found in:
https://towardsdatascience.com/trading-strategy-technical-analysis-with-python-ta-lib-3ce9d6ce5614
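To give a feel for what a technical analysis indicator is, here is a minimal pure-Python sketch of a simple moving average over closing prices. It does not use TA-Lib (which ships its own indicator functions), and the function name and the sample prices are made up for illustration only.

```python
def simple_moving_average(prices, period):
    """Return the simple moving average of `prices` over `period` points.

    Positions that do not yet have a full window are filled with None.
    """
    if period <= 0:
        raise ValueError("period must be a positive integer")
    averages = []
    for i in range(len(prices)):
        if i + 1 < period:
            # Not enough history yet for a full window.
            averages.append(None)
        else:
            window = prices[i + 1 - period : i + 1]
            averages.append(sum(window) / period)
    return averages


# Hypothetical closing prices, not real market data.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
sma3 = simple_moving_average(closes, 3)
print(sma3)
```

With real data, the list of closes would come from one of the download sources listed above.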

Thursday, January 09, 2020

Chess Software

This post came about while looking for databases of historical chess games and for software to replay those games.

The first step was searching for "free chess database", which turned up the link below.
https://www.chess.com/forum/view/general/world-largest-offline-chess-database-for-free

On that site, codekiddy has collected over 30 million games spanning 500 years.
So how do we use his database? He says:
"I've been using Scid vs PC program to manage database and therefore the database is in scid format, meaning you can download it and do with it what ever you want, such as importing more games."

So the second search is for SCID which led to:
https://en.wikipedia.org/wiki/Shane%27s_Chess_Information_Database
http://scid.sourceforge.net/
That is software that can potentially open codekiddy's 500-year chess database.
But wait, codekiddy actually names "Scid vs PC" as the program. That program also appears in the Wikipedia page's references, leading next to.....

... the third item "Scid vs PC".....
http://scidvspc.sourceforge.net/

Now to try these out......

Wednesday, January 01, 2020

Cloud Compute Free Tiers - AWS, Azure, Google GCP, AliCloud, Oracle

This is a quick note on which is the best (or most preferable) Cloud Compute resource that I can get for free. The best means 'best for me' so I will not hesitate to declare a clear winner in this comparison - because this is NOT a recommendation to the public.

What is important when selecting a "FREE TIER" in Cloud Compute resources for 'me':
- Long time frame - anything less than 12 months I would not waste time considering.
- Ease of sign-up - e.g. not asking for a credit card
- Amount of Compute Resources. Yes, I put this in 3rd rather than 1st place, because I expect most providers to give very tiny amounts of resources anyway.
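The priorities above can be sketched as a filter followed by a sort. This is only an illustration of the ordering: the Offer class, the shortlist function and the three sample offers are hypothetical and do not describe any real provider.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Offer:
    name: str                  # hypothetical label, not a real provider
    months: Optional[int]      # None means the offer does not expire
    needs_credit_card: bool
    vcpus: float
    memory_gib: float


def shortlist(offers, min_months=12):
    """Drop offers that expire too soon, then rank the rest.

    Ranking: no credit card first, then more vCPUs, then more memory.
    """
    long_enough = [
        o for o in offers if o.months is None or o.months >= min_months
    ]
    return sorted(
        long_enough,
        key=lambda o: (o.needs_credit_card, -o.vcpus, -o.memory_gib),
    )


# Made-up offers for illustration only.
offers = [
    Offer("offer-a", 3, False, 4, 8),       # generous but too short
    Offer("offer-b", 12, True, 2, 1),       # long enough, wants a card
    Offer("offer-c", None, False, 0.25, 1), # never expires, no card
]
ranked = [o.name for o in shortlist(offers)]
print(ranked)
```

Note how the short offer is dropped despite having the most resources, which matches putting time frame first and compute last.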

I also provide the direct link to the exact specifications, which is often not obvious and hidden away from the main marketing page.

The details below are as of the time of writing, January 2020.

Google Cloud Platform (GCP) offers the f1-micro instance.
https://cloud.google.com/compute/docs/machine-types
For 12 months
"Micro machine type with 0.2 vCPU and 0.6 GB of memory, backed by a shared physical core."
Not even 1 whole core.

Alibaba Cloud offers the t5-lc1m1.small
https://www.alibabacloud.com/product/ecs-t5
12 months
1 vCPU, 1 GiB, 10% baseline performance

Microsoft Azure offers the Standard B1s
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/b-series-burstable
12 months
1 vCPU, 1 GiB, 10% baseline performance

Amazon AWS offers the t3.micro instance
https://aws.amazon.com/ec2/instance-types/t3/
12 months
2 vCPUs, 1 GiB, 10% baseline performance
The oldest cloud player here, yet more generous than the above so far.....
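One rough way to compare the four 12-month offers above is to multiply the vCPU count by the baseline percentage. This is a back-of-envelope sketch using only the figures quoted in this post. It assumes the baseline applies per vCPU and ignores burst credits, so treat the result as an approximation rather than a benchmark.

```python
# (vCPUs, baseline fraction) as quoted in this post.
# GCP is quoted as 0.2 vCPU outright, so its baseline is taken as 1.0.
tiers = {
    "GCP": (0.2, 1.0),
    "Alibaba Cloud": (1, 0.10),
    "Azure": (1, 0.10),
    "AWS": (2, 0.10),
}


def sustained_vcpu(vcpus, baseline):
    """Approximate sustained CPU capacity, assuming baseline is per vCPU."""
    return vcpus * baseline


sustained = {
    name: round(sustained_vcpu(vcpus, baseline), 3)
    for name, (vcpus, baseline) in tiers.items()
}

# Print the tiers from most to least sustained CPU.
for name, value in sorted(sustained.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} vCPU sustained")
```

Under these assumptions the sustained CPU of all four is a fraction of one core, and the larger differences are in memory and burst behaviour.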

Oracle Cloud offers ??? instance
https://www.oracle.com/au/cloud/compute/pricing.html
ALWAYS Free
The specs listed at the link cannot be used, because it is not clear which VM instance is provided.
But on the main page, it says:
Databases: 2 databases total, each with 1 OCPU and 20 GB storage.
Compute: 2 virtual machines with 1/8 OCPU and 1 GB memory each.
Storage: 2 Block Volumes, 100 GB total. 10 GB Object Storage. 10 GB Archive Storage.
But on the link above, the smallest VM is:
VM.Standard.E2.1 with 1 OCPU, 8 GB memory, 1 PB block volumes.

And the WINNER is (clearly) - Oracle Cloud. It is Always Free and offers 2 VMs.

That was a terrible experience! I tried registering for the free cloud tier using my credit card, but the application was rejected because the card was a prepaid VISA credit card. Even though the tier is supposed to be free, they still want to keep your main credit card number on servers that can never be guaranteed to be secure.

Sorry - No winners now..... still looking for a decent free tier cloud.

Disclaimer: Again, this is my personal note and opinion. This is NOT any kind of recommendation.