
I would describe data collection/mangling/processing as more annoying and tedious than "hard". I'd seen the Quantopian GitHub account before, and appreciate what y'all have done. I didn't know there was a business model behind it, but I get it now.

You said it solves "the hard parts of algo writing". I guess that's true if speaking of algos in a general sense: it makes it easier to get started. But if the goal is to write profitable algorithms, then it's much less true. I just looked again at the API, and using it to create a profitable strategy would actually be more difficult (and less profitable) than it would be to trade elsewhere.

The main reason is because the function to make a trade will only place market orders. It's well known (in the financial literature) that an algorithm which trades with limit orders will almost always outperform one which trades with market orders (pretty much obviously true, since the limit order gets the better price). That's because a passive trading (market making) algorithm earns the spread, whereas with market orders you have to beat the spread just to break even.
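
To make the spread arithmetic concrete, here is a minimal sketch in Python. The function names and quote values are made up for illustration (they are not from Quantopian's or any broker's API). Prices are in integer cents to avoid float rounding, and the passive case assumes both resting orders actually fill, which ignores fees, queue position, and adverse selection.

```python
def market_round_trip(bid_cents: int, ask_cents: int) -> int:
    """P&L per share of buying and then selling with market orders
    while the quote stays unchanged: buy at the ask, sell at the bid."""
    return bid_cents - ask_cents


def passive_round_trip(bid_cents: int, ask_cents: int) -> int:
    """P&L per share if a resting buy at the bid and a resting sell
    at the ask both get filled (an assumption, not a guarantee)."""
    return ask_cents - bid_cents


# Hypothetical quote: bid 10.00, ask 10.02 (a 2-cent spread).
bid, ask = 1000, 1002
print("market orders:", market_round_trip(bid, ask), "cents per share")
print("limit orders: ", passive_round_trip(bid, ask), "cents per share")
```

With an unchanged quote the market-order trader starts one spread in the hole, which is the "beat the spread just to break even" point.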

Of course, that's why many brokers only allow market orders (they collect the spread as their trading fee).

Also, I don't see how the backtester avoids look-ahead bias. Backtesting like that just encourages data dredging[1], which produces an algorithm overfitted to the test set. It would be better to backtest on random subsets of the data (cross-validation and all that). But I can understand why that isn't done (it would be harder for users to create should-be-profitable algorithms and start trading with them).

1. http://en.wikipedia.org/wiki/Data_dredging



> The main reason is because the function to make a trade will only place market orders.

We (I work for Quantopian) will surely support other order types when we support live trading.

> Also, I don't see how the backtester avoids look-ahead bias.

We provide over ten years of historical minute bar data for U.S. equities, with no survivorship bias. This means two things:

1. The amount of data we provide is sufficiently large that if you test your algorithm against a bunch of stocks over that entire period of time and it performs reasonably well, it's unlikely that it's overfitted to the data in a way that is going to bite you on the ass in live trading.

2. But just to be even more paranoid, the smart algorithm writer will do just what you describe -- divide the available data into lots of subsets, randomly pick which subsets of the data to test against each time you backtest, and not start live trading an algorithm until it has been confirmed to perform well on random subsets of data that it hasn't previously been tested on. Right now on Quantopian you'd have to do all that data segmentation and selection by hand, but I suspect that we will eventually add features to make it easier to do automatically.
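
For anyone doing that segmentation by hand, a rough sketch of the bookkeeping in plain Python might look like this. Nothing here is a Quantopian feature: the helper names, the date range, the 90-day block length, and the 30% hold-out fraction are all assumptions picked for illustration.

```python
import random
from datetime import date, timedelta


def split_periods(start, end, days_per_block):
    """Cut [start, end) into consecutive blocks of at most days_per_block days."""
    blocks = []
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=days_per_block), end)
        blocks.append((cur, nxt))
        cur = nxt
    return blocks


def choose_holdout(blocks, holdout_fraction, seed):
    """Randomly set aside a fraction of blocks that are never used while
    developing the algorithm; returns (develop, holdout), each sorted by date."""
    rng = random.Random(seed)
    shuffled = list(blocks)
    rng.shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return sorted(shuffled[n_holdout:]), sorted(shuffled[:n_holdout])


# Hypothetical ten-year range, split into roughly quarterly blocks.
blocks = split_periods(date(2002, 1, 1), date(2012, 1, 1), 90)
develop, holdout = choose_holdout(blocks, 0.3, seed=42)
print(len(develop), "blocks for development,", len(holdout), "held out")
```

The idea is to backtest only on the `develop` periods while tweaking, and to run the `holdout` periods once at the end. Every extra look at the hold-out set makes it less of a hold-out.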


> We (I work for Quantopian) will surely support other order types when we support live trading.

I suspect they won't be true limit orders, but stop orders (true limit orders wouldn't slip). Also, for a passive trading algorithm it's important to have level II data (the order book), or the algorithm is flying blind.

Re 1, it's not the size of the data set that prevents overfitting (larger data sets actually make it more likely), but how it is used when developing/training the algorithm. Repeatedly testing and tweaking the algorithm will overfit, unless one is careful to keep the complexity of the algorithm in check.
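
A toy simulation of that effect, as a sketch only: every "variant" below has zero true edge (daily returns are pure Gaussian noise), yet picking the best one after many backtests still produces a good-looking in-sample number. The function name, the 1% daily volatility, and the variant and day counts are arbitrary assumptions.

```python
import random


def simulate_selection(n_variants, n_days, seed):
    """Backtest n_variants no-edge strategies, keep the one with the best
    in-sample total return, and report how it does on fresh data.
    Returns (best_in_sample_return, its_out_of_sample_return)."""
    rng = random.Random(seed)
    in_sample, out_of_sample = [], []
    for _ in range(n_variants):
        # Zero mean: any apparent profit is luck, not skill.
        in_sample.append(sum(rng.gauss(0, 0.01) for _ in range(n_days)))
        out_of_sample.append(sum(rng.gauss(0, 0.01) for _ in range(n_days)))
    best = max(range(n_variants), key=lambda i: in_sample[i])
    return in_sample[best], out_of_sample[best]


best_in, best_out = simulate_selection(n_variants=200, n_days=250, seed=0)
print(f"best variant in-sample:     {best_in:+.2%}")
print(f"same variant out-of-sample: {best_out:+.2%}")
```

The in-sample winner is selected for its luck, so its out-of-sample result is just another draw from the same zero-mean noise. Each tweak-and-retest cycle is effectively one more variant in the pool.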

Anyway, best of luck to your users! But they should be warned that active trading on signals (price prediction) is the hard way to algorithmic profit. That's why market making and arbitrage are the bread and butter of the pros, not signal trading.[1]

1. The speaker says as much somewhere in the first 20:00 here: http://www.youtube.com/watch?v=hKcOkWzj0_s


In that video the quotes to which I guess you are referring are "[investing] is almost entirely a non-professional role" (08:40), "[Market-making] is a highly professionalized role" (09:13), and "[an arbitrageur is a] highly professional role" (11:22).

I thought the video had something in support of your "price prediction is the hard way to algorithmic profit" statement, but I can't find anything explicitly in support of that (what I thought was your implication). He kind of implies something like that at 15:00 when he says "we have no idea how [price prediction] works [at longer than a few days in the future]", but that's not really very strong.

When you said "active trading on signals [...] is the hard way", did you mean active trading at minute-to-day or greater holding periods? That seems a) right :); but b) slightly at odds with your "it's important to have level II data", since I would have thought that is less important at an hourly-to-day or greater holding period.



