Building an Investment Platform

Date: 2016-06-26

Today I wanted to share an introduction to some of the major components and technologies I built for my investment platform running on AWS.

Summary

This investment platform automates profitable trading strategies through custom algorithms.

It was built to hunt for custom, targeted investment opportunities in real time, which required me to build numerous distributed components: a real-time pricing engine, a data-mining news and event pipeline, a custom search engine (not Elastic), an ETF option price chain detection system, custom technical and fundamental indicators, a backtesting engine (it’s even mobile-ready), a forwardtesting price projection system, an algorithmic genetic engine, and nightly automatic retesting of each strategy’s accuracy and profitability using data science and statistics.

The stack is written entirely in Python 2 and represents the bulk of my programming effort to date.

Below are some of the products, components and systems I had to build for my investment platform.

eNewsPulse

[Image: eNewsPulse]

Around mid-June 2015, I launched a derivative company with a licensing agreement to use the search engine and real-time email notification system (AWS SES) from the original platform, in the hope of finding a paying user base with a subscription B2C business model. By August 2015, the product had netted a customer over $30,000 in one day on one trade. Looking back, this product was too niche and worked best with small and micro-cap biotechs receiving FDA or drug-related news.

Here is the website: https://enewspulse.com/home/

A Real-Time Distributed News Pipeline

The original purpose of the platform was to data mine the major news sites in real time, applying keyword matching and logical filters to find relevant news and notify us when it did. I often compare it to sifting through a stream of water and extracting small nuggets of gold in real time.
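To make the keyword matching and logical filters concrete, here is a minimal sketch in Python 3. It is an illustration rather than the platform's code: the function names, the rule layout (`all_of`, `any_of`, `none_of`) and the sample headlines are all assumptions made for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(headline, all_of=(), any_of=(), none_of=()):
    """Return True when the headline passes the AND / OR / NOT keyword filters."""
    words = tokenize(headline)
    if not all(k in words for k in all_of):      # every AND keyword must appear
        return False
    if any_of and not any(k in words for k in any_of):  # at least one OR keyword
        return False
    if any(k in words for k in none_of):         # no NOT keyword may appear
        return False
    return True

# Hypothetical rule: FDA news about approvals, ignoring recalls.
rule = {
    "all_of": ["fda"],
    "any_of": ["approval", "approves", "clearance"],
    "none_of": ["recall"],
}

# Made-up headlines standing in for the live news stream.
headlines = [
    "FDA approves new therapy from small biotech",
    "FDA announces recall of device",
    "Retailer posts quarterly earnings",
]

hits = [h for h in headlines if matches(h, **rule)]
```

In a real pipeline each hit would be handed to the notification step instead of being collected in a list.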

A Real-Time Distributed Pricing Engine

I have an intra-day pricing and compression engine for targeted ETFs and the S&P 500. Right now the system updates every minute, compresses the prices into configurable time units and kicks off the algorithmic trading processors to test whether there is a profitable opportunity.
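One common way to compress one-minute prices into a configurable time unit is to roll them up into wider open/high/low/close/volume bars. The sketch below assumes that is what compression means here; the bar layout (a dict with a `minute` offset from the open) and the `compress` function are hypothetical, and the prices are made up.

```python
from collections import OrderedDict

def compress(bars, minutes):
    """Roll 1-minute bars up into `minutes`-wide OHLCV bars.

    Each input bar is a dict with keys: minute (int, minutes since the open),
    open, high, low, close, volume. Bars are assumed to be sorted by minute.
    """
    buckets = OrderedDict()
    for bar in bars:
        key = bar["minute"] // minutes * minutes  # start minute of the bucket
        out = buckets.get(key)
        if out is None:
            # First bar in the bucket supplies the open (copy, do not mutate input).
            buckets[key] = dict(bar, minute=key)
        else:
            out["high"] = max(out["high"], bar["high"])
            out["low"] = min(out["low"], bar["low"])
            out["close"] = bar["close"]           # last bar supplies the close
            out["volume"] += bar["volume"]
    return list(buckets.values())

# Made-up minute bars for illustration only.
minute_bars = [
    {"minute": 0, "open": 100.0, "high": 100.5, "low": 99.8, "close": 100.2, "volume": 1200},
    {"minute": 1, "open": 100.2, "high": 100.9, "low": 100.1, "close": 100.7, "volume": 900},
    {"minute": 2, "open": 100.7, "high": 100.8, "low": 100.0, "close": 100.1, "volume": 1500},
    {"minute": 3, "open": 100.1, "high": 100.4, "low": 99.5, "close": 99.9, "volume": 800},
    {"minute": 4, "open": 99.9, "high": 100.3, "low": 99.7, "close": 100.2, "volume": 700},
    {"minute": 5, "open": 100.2, "high": 101.0, "low": 100.2, "close": 100.9, "volume": 1100},
]

five_minute = compress(minute_bars, 5)
```

Changing the second argument is all it takes to switch the time unit, which is the property that lets the same indicators run on 5-minute, 15-minute or hourly candles.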

A Real-Time Distributed Notification System

From day one, the platform was responsible for reliably alerting my team over email in under 60 seconds when it found a profitable trade. Because normal email is laggy (sometimes adding 20+ seconds for large customer batch emails), I plan to use other notification tools that can get us under the 10-second barrier.

The ETF Price Prediction Playbook

Every night, my team receives a Playbook email outlining probable pricing and directional confidence for our targeted ETFs. Through back and forward testing of numerous technical indicators (including ones we built), it assesses the confidence and direction of an ETF over multiple time horizons (for now the maximum is 30 days into the future).

A Backtesting Engine

I built a web-accessible backtesting engine that optimizes and tunes thousands of technical and fundamental indicators against time-agnostic candlestick data to automatically find profitable configurations. The system uses a Highlander-style “there can be only one” model: only the best-performing configuration is recorded and added to the real-time pricing algorithm engines. It also supports automatic, randomized sample testing so I can allocate a budget per ETF/ticker, per indicator and per time horizon.
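The idea of keeping a single champion while spending a fixed sampling budget can be sketched as a budgeted random search. Everything specific below is an assumption for illustration: the SMA crossover strategy, the parameter ranges, the synthetic prices and the function names are not taken from the platform.

```python
import random

def sma(values, window):
    """Simple moving average; None until enough history exists."""
    return [
        None if i + 1 < window else sum(values[i + 1 - window:i + 1]) / window
        for i in range(len(values))
    ]

def backtest(closes, fast, slow):
    """Total return of a long-only SMA crossover: hold while fast SMA > slow SMA."""
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    equity = 1.0
    for i in range(1, len(closes)):
        f, s = fast_ma[i - 1], slow_ma[i - 1]  # act on the previous bar's signal
        if f is not None and s is not None and f > s:
            equity *= closes[i] / closes[i - 1]
    return equity - 1.0

def highlander_search(closes, budget, rng):
    """Sample `budget` random configurations and keep only the single best one."""
    best = None
    for _ in range(budget):
        fast = rng.randint(2, 10)
        slow = rng.randint(fast + 1, 30)
        score = backtest(closes, fast, slow)
        if best is None or score > best["score"]:
            best = {"fast": fast, "slow": slow, "score": score}
    return best

# Synthetic closing prices, generated only so the sketch is runnable.
rng = random.Random(42)
closes = [100.0]
for _ in range(250):
    closes.append(closes[-1] * (1 + rng.gauss(0.0005, 0.01)))

champion = highlander_search(closes, budget=50, rng=rng)
```

Running one search per ticker, per indicator and per time horizon, each with its own `budget`, is one way the per-item budget allocation could be expressed.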

A Forwardtesting Engine

While looking into the past is a great starting point, I also simulate positive, negative and horizontal pricing scenarios to assess how likely the current algorithms’ predictions are to hold up against different, hypothetical futures.
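A rough sketch of this kind of scenario testing follows. It assumes the hypothetical futures are random walks with a positive, negative or zero daily drift, and it scores a single "up" prediction against each one. The drift and volatility values, the 30-day horizon and the function names are placeholders, not the platform's settings.

```python
import random

# Assumed daily drift for each hypothetical future.
SCENARIOS = {"positive": 0.002, "negative": -0.002, "horizontal": 0.0}

def simulate_path(start, drift, volatility, days, rng):
    """One hypothetical future: a random walk with the given daily drift."""
    prices = [start]
    for _ in range(days):
        prices.append(prices[-1] * (1 + rng.gauss(drift, volatility)))
    return prices

def hit_rate(predicted_direction, start, drift, volatility, days, trials, rng):
    """Fraction of simulated futures in which the predicted direction came true."""
    hits = 0
    for _ in range(trials):
        end = simulate_path(start, drift, volatility, days, rng)[-1]
        actual = "up" if end > start else "down"
        hits += actual == predicted_direction
    return hits / float(trials)

rng = random.Random(7)
report = {
    name: hit_rate("up", start=100.0, drift=drift, volatility=0.01,
                   days=30, trials=500, rng=rng)
    for name, drift in SCENARIOS.items()
}
```

The `report` dict holds one hit rate per scenario, so a prediction that only holds up in the positive scenario is easy to spot.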

The Screener System

One of the most recent additions is a system for tracking the performance of thousands of screeners that use fundamental and technical indicators across diverse industries and sectors. This system is integrated into Slack via a bot for finding the best tickers and screens of the week or from the archive.

ETF Option Price Calculator

Through the use of data science, I have tuned my original Black-Scholes option calculator to function during intra-day market conditions. Every night it auto-adjusts to changes in volatility, macro-level market conditions and historical accuracy tolerances. I find it a great way to assess option spread profitability.
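For reference, the textbook Black-Scholes price for a European option on a non-dividend-paying asset looks like this in Python. This is only the standard starting formula; the intra-day tuning and nightly adjustments described above are not shown, and the function and parameter names are chosen for this example.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes(spot, strike, years, rate, vol, kind="call"):
    """Textbook Black-Scholes price of a European call or put.

    spot:   current price of the underlying
    strike: option strike price
    years:  time to expiry in years (must be > 0)
    rate:   annual risk-free rate, e.g. 0.05 for 5%
    vol:    annualized volatility, e.g. 0.2 for 20%
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    discounted_strike = strike * exp(-rate * years)
    if kind == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)

# At-the-money example: one year out, 5% rate, 20% volatility.
call = black_scholes(100.0, 100.0, 1.0, 0.05, 0.2, kind="call")  # roughly 10.45
put = black_scholes(100.0, 100.0, 1.0, 0.05, 0.2, kind="put")    # roughly 5.57
```

Volatility is the input that moves the price most from day to day, which is why a nightly re-estimate of it is a natural place to tune.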

Option Spread Profitability

Once the option calculator worked, I built a web-based option spread calculator that we use for sanity-checking spreads prior to execution. Since I do not have a paid subscription to an option chain pricing data feed, I have not built this into a pipeline like the news pipeline just yet.
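As an example of the arithmetic behind sanity-checking a spread, here is a sketch for one simple case: a vertical call debit spread (buy the lower strike, sell the higher strike, same expiry). The choice of spread type, the function names and the premiums are assumptions. Figures are per share and ignore commissions and the contract multiplier.

```python
def vertical_call_spread(long_strike, long_premium, short_strike, short_premium):
    """Summarize a call debit spread: long the lower strike, short the higher one."""
    debit = long_premium - short_premium  # cash paid to open the position
    width = short_strike - long_strike    # distance between the two strikes
    return {
        "net_debit": debit,
        "max_loss": debit,                # both calls expire worthless
        "max_profit": width - debit,      # price finishes at or above the short strike
        "breakeven": long_strike + debit,
    }

def profit_at_expiry(price, long_strike, long_premium, short_strike, short_premium):
    """Profit or loss per share if the underlying settles at `price` on expiry."""
    long_leg = max(price - long_strike, 0.0) - long_premium
    short_leg = short_premium - max(price - short_strike, 0.0)
    return long_leg + short_leg

# Made-up quotes: buy the 100 call for 3.00, sell the 105 call for 1.00.
summary = vertical_call_spread(100.0, 3.0, 105.0, 1.0)
scenarios = {p: profit_at_expiry(p, 100.0, 3.0, 105.0, 1.0) for p in (95.0, 102.0, 110.0)}
```

With these numbers the position risks 2.00 to make at most 3.00 and breaks even at 102.00, which is the kind of quick check worth doing before sending an order.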

SEC Integration

For now, the platform is integrated with the SEC to verify that the symbols and companies are traded on US exchanges.

An Algorithmic Genetic Engine

My first attempt at predicting profitable trades led me down the path of building my own genetic engine for automatic backtesting of investment algorithms. It sampled from a genetic pool of technical indicators and randomly configured the indicators’ alleles. Once enough indicators were randomly created, they were combined into algorithm chains (DNA lifeforms). These chains were then graded for successful buy/sell predictions through backtesting against each other (a population).

By using natural-selection culling to keep only the top-performing 20% across hundreds of survival-based generations (including possible randomized-mutation events), I was eventually rewarded with a set of technical indicator algorithms that worked for a couple of years on AMZN. In hindsight, this system is amazingly accurate but requires a dedicated budget to find profitable algorithms by itself.

I think of this kind of system as hot-spot success detection. If there was a successful lifeform, the search tended to circle around that spot more and more with diminishing returns. One final piece of food for thought: building your own will take a long time and requires a good deal of capital to be profitable. It’s like mining for bitcoins in a sea of candlestick data.
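The loop described above (random indicator genes, chains, grading, culling to the top 20%, mutation) can be sketched in a few dozen lines. This is a toy version under stated assumptions: the two indicators, the majority-vote rule, the next-bar accuracy used as fitness, the population sizes and the synthetic prices are all invented for the example.

```python
import random

INDICATORS = ("momentum", "above_sma")  # hypothetical gene pool

def random_gene(rng):
    """One indicator with a randomly configured lookback (its allele)."""
    return (rng.choice(INDICATORS), rng.randint(2, 20))

def signal(gene, closes, i):
    """True when the indicator is bullish at bar i, using data up to bar i only."""
    name, window = gene
    if i < window:
        return False  # not enough history yet
    if name == "momentum":
        return closes[i] > closes[i - window]
    return closes[i] > sum(closes[i - window + 1:i + 1]) / window

def fitness(chain, closes):
    """Fraction of bars where the chain's majority vote called the next move."""
    correct = 0
    for i in range(len(closes) - 1):
        votes = sum(signal(g, closes, i) for g in chain)
        predicted_up = votes * 2 > len(chain)
        correct += predicted_up == (closes[i + 1] > closes[i])
    return correct / float(len(closes) - 1)

def evolve(closes, rng, population=30, chain_len=3, generations=15,
           keep=0.2, mutation=0.1):
    """Evolve indicator chains, keeping the top `keep` fraction each generation."""
    pool = [[random_gene(rng) for _ in range(chain_len)] for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=lambda c: fitness(c, closes), reverse=True)
        survivors = pool[:max(2, int(population * keep))]  # cull to the top 20%
        children = []
        while len(survivors) + len(children) < population:
            mom, dad = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(mom, dad)]  # crossover
            # Random mutation: occasionally replace a gene with a fresh one.
            child = [random_gene(rng) if rng.random() < mutation else g for g in child]
            children.append(child)
        pool = survivors + children
    return max(pool, key=lambda c: fitness(c, closes))

# Synthetic closing prices, generated only so the sketch is runnable.
rng = random.Random(3)
closes = [100.0]
for _ in range(200):
    closes.append(closes[-1] * (1 + rng.gauss(0.0005, 0.01)))

best_chain = evolve(closes, rng)
```

Because the fitness here is measured on the same prices used for selection, a winner from a sketch like this would still need out-of-sample testing before it meant anything.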

So what’s next?

I wrote this post in the hope that it will help others looking to build similar technologies. In the coming posts, I plan to talk about our data science approach, and I look forward to hearing from you.

Thanks for reading,

Jay
