Market Data and Feeds
Every market quote you see—whether on your broker's platform, a financial news site, or a professional trading terminal—originates from a market data feed. These feeds transmit thousands of prices per second across the globe, enabling traders to make decisions, algorithms to execute strategies, and markets to function. Market data is not simply information; it's a valuable, separately charged commodity that exchanges actively market and monetize. A bank trading equities might spend $500,000 annually on market data subscriptions. A retail investor using free, delayed quotes doesn't pay directly for data but sacrifices freshness and accuracy. Understanding market data feeds—what they contain, how they're priced, where to access them, and what latency means—is fundamental for anyone seriously engaging with financial markets.
Quick definition
Market data feeds are real-time or near-real-time streams of information from exchanges transmitting prices, volumes, and trading activity. Feeds come in multiple tiers—Level 1 (best bid/ask), Level 2 (order book depth), Level 3 (execution details)—and are provided by exchanges, brokers, or third-party vendors at various price points. Latency (the delay between when a trade executes and when data is transmitted) is critical; professional traders may pay premium prices for microsecond-latency feeds while retail traders accept delays of seconds or minutes.
Key takeaways
- Market data is a separate commodity from trading itself; exchanges license and charge for data access independent of trading fees
- Real-time versus delayed data: Real-time quotes cost $20-100+ per month per exchange; delayed data (15-20 minute delay) is often free
- Level 1 data (best bid/ask) is the minimal requirement; Level 2 data (full order book) is essential for serious traders
- Co-located data feeds and microsecond-latency services cost thousands per month but give high-frequency traders a speed advantage measured in microseconds
- Exchange data is licensed and proprietary: exchanges control data distribution and actively enforce licensing agreements
- Alternative data providers (Bloomberg, Reuters, Polygon, Alpaca) aggregate exchange data and resell at different price points
- Data validation and cleaning is critical; exchange feeds contain errors, dropped packets, and duplicate messages that must be handled
- Latency and bandwidth create trade-offs; real-time data requires high bandwidth and introduces challenges for long-distance transmission
The Structure of Market Data: Tiers and Levels
Market data comes in standardized tiers, each providing progressively more detail:
Level 1 Data (Best Bid/Ask): The most basic and publicly available tier. Level 1 shows:
- The best (highest) bid price and bid size
- The best (lowest) ask price and ask size
- Last trade price and volume
- Daily high, low, open, and close prices
Level 1 data is often freely available on broker platforms and financial news sites because it's the minimum information a trader needs to make basic buy/sell decisions. Most retail investors operating on free platforms see only Level 1 data.
Level 2 Data (Order Book Depth): Shows multiple levels of bids and asks:
- The 5, 10, or 20 best bid prices and sizes
- The 5, 10, or 20 best ask prices and sizes
- Market maker identities (typically shown as four-character market participant identifiers, or MPIDs)
- Time of day for each order
Level 2 is essential for:
- Scalpers and day traders who need to understand order book structure to identify support/resistance and predict short-term price moves
- Algorithmic traders who use order book imbalances as signals
- Traders managing large orders who need to understand where liquidity is available
Level 2 typically costs $15-50 per month per exchange on retail platforms and $50-500 per month for institutional-grade feeds.
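As a rough illustration of how an order book imbalance signal might be derived from Level 2 depth, the sketch below compares resting bid size with resting ask size over the top few levels. The `book_imbalance` function, the sample prices, and the formula (bid size minus ask size, divided by their sum) are illustrative assumptions, not a definition published by any exchange or vendor.

```python
from typing import List, Tuple

Level = Tuple[float, int]  # (price, size)

def book_imbalance(bids: List[Level], asks: List[Level], depth: int = 5) -> float:
    """Return (bid_size - ask_size) / (bid_size + ask_size) over the top `depth` levels.

    +1.0 means all resting size is on the bid; -1.0 means all of it is on the ask.
    """
    bid_size = sum(size for _, size in bids[:depth])
    ask_size = sum(size for _, size in asks[:depth])
    total = bid_size + ask_size
    if total == 0:
        return 0.0  # empty book: no signal
    return (bid_size - ask_size) / total

# Hypothetical snapshot: bids sorted best (highest) first, asks best (lowest) first.
bids = [(100.00, 500), (99.99, 300), (99.98, 200)]
asks = [(100.01, 200), (100.02, 200), (100.03, 100)]
imbalance = book_imbalance(bids, asks)
print(f"imbalance = {imbalance:+.2f}")
```

A positive value means more size is resting on the bid side than on the ask side. Whether that predicts anything is a question for testing, not something this sketch establishes.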
Level 3 Data (Execution Details / Raw Trade Data): The most granular level:
- Every trade executed (not just the best bid/ask or depth)
- Timestamp of execution (often to microseconds or nanoseconds)
- Trade size and price
- Participant identifiers
Level 3 and raw trade feeds are expensive ($500+ per month per exchange) because they provide the highest-resolution information about market activity. High-frequency traders, market makers, and researchers use these feeds to analyze market microstructure and infer information from order flow patterns.
Multicast/Co-located Data: Some exchanges offer data directly from their systems at the highest speeds (microsecond latency) through direct connections. This data is available only to firms willing to pay for co-location at the exchange's data center. Costs include:
- Co-location fees: $1,000-5,000+ per month
- Direct feed connectivity: $1,000+ per month
- Implementation and maintenance: significant engineering resources
For high-frequency traders operating on microsecond time scales, the latency advantage justifies these costs. A 1-microsecond advantage in seeing market data translates to a measurable edge in certain trading strategies.
Real-Time versus Delayed Data
Real-time data shows prices as they execute, with minimal delay (typically less than 100 milliseconds for retail-grade systems). For professional traders, real-time data is essential because prices can move significantly in seconds. A trader holding overnight positions or monitoring earnings announcements needs real-time data to react quickly.
Delayed data (typically 15-20 minutes for major US exchanges) is free or low-cost, but it shows the market as it was, not as it is. A trader using 20-minute delayed data is looking at trades that executed 20 minutes ago. For a stock trading at $100 that drifts $0.50 per minute in one direction, the delayed price would be $10 (20 × $0.50) away from the real-time price. This is significant enough to impair decision-making.
However, delayed data is sufficient for:
- Long-term investors making strategic allocation decisions (who don't care if they buy at 9:31 AM versus 9:51 AM)
- Tracking general market direction and sentiment
- Conducting historical analysis and backtesting
The trade-off is explicit: real-time data costs money; delayed data is free or cheap. This creates a tiered market where:
- Retail investors using free platforms see delayed data and operate at an information disadvantage
- Semi-professional traders pay $20-50 per month per exchange for real-time data
- Professional traders and institutions pay $500+ per month per exchange
Consolidated feeds versus exchange-native feeds: Many data providers offer "consolidated" feeds that aggregate Level 1 data from multiple exchanges. For US equities, the consolidated feed (the "National Best Bid Offer" or NBBO) shows the best bid and ask across all US exchanges. This consolidated feed is available from multiple vendors, creating competition and lower pricing. However, exchange-native feeds (data directly from a single exchange) are proprietary and exchange-controlled, with less competition and higher prices.
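To make the idea of a consolidated best bid and offer concrete, the following sketch picks the highest bid and lowest ask from a set of per-venue quotes. The `Quote` class, the `nbbo` function, and the sample prices are hypothetical; the official NBBO calculation also applies tie-breaking and eligibility rules that are ignored here.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

@dataclass(frozen=True)
class Quote:
    venue: str
    bid: float
    ask: float

def nbbo(quotes: Iterable[Quote]) -> Tuple[Quote, Quote]:
    """Return (quote with the highest bid, quote with the lowest ask)."""
    quotes = list(quotes)
    if not quotes:
        raise ValueError("need at least one quote")
    best_bid = max(quotes, key=lambda q: q.bid)
    best_ask = min(quotes, key=lambda q: q.ask)
    return best_bid, best_ask

# Hypothetical top-of-book quotes for one stock on three venues.
quotes = [
    Quote("NYSE", 100.00, 100.03),
    Quote("NASDAQ", 100.01, 100.04),
    Quote("CBOE", 99.99, 100.02),
]
best_bid, best_ask = nbbo(quotes)
print(f"NBBO: {best_bid.bid} ({best_bid.venue}) x {best_ask.ask} ({best_ask.venue})")
```

Note that the best bid and best ask can come from different venues, which is why a single exchange-native feed does not show the NBBO on its own.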
Market Data Standards and SIP (Securities Information Processor)
In the US equity market, the Securities and Exchange Commission (SEC) requires that certain market data be processed and disseminated through official Securities Information Processors (SIPs) to ensure that all investors have access to best bid/offer information simultaneously.
The SIP system consolidates data from all US stock exchanges (NYSE, Nasdaq, CBOE, etc.) and disseminates the consolidated NBBO (National Best Bid Offer) and last-sale information to all market participants in real time. This is mandated by SEC rules under Regulation NMS, with the aim of giving every investor access to the same consolidated quote.
However, SIP data reaches users later than direct exchange feeds, because quotes must travel to the processor, be consolidated, and then be redistributed. Delivered through a retail platform, the delay can reach tens of milliseconds, which is acceptable for retail purposes but slow by high-frequency standards. This is why HFTs subscribe to direct feeds from each exchange in parallel and build their own consolidated view, which reaches them sooner than the official SIP.
The cost of the SIP: Access to official SIP data is available to all brokers and traders. The SIPs are funded by fees charged to data subscribers, with the revenue shared among the participating exchanges; they are not subsidized by the SEC. The cost to end users varies: some brokers include SIP-based real-time data with their trading platforms; others charge for it.
Alternative Data Providers and Aggregation
Beyond exchanges themselves, several data aggregators and alternative providers have emerged, offering market data at different price points and with different coverage.
Bloomberg Terminal is the gold standard for institutional market data. A Bloomberg subscription includes:
- Real-time market data from all major exchanges worldwide
- All levels of market depth
- Historical data and analysis tools
- News, research, and analytics
Cost: $20,000+ per year per terminal. Thousands of institutions maintain Bloomberg subscriptions as their primary market data source.
Reuters (Refinitiv) provides similar functionality to Bloomberg and is used by many institutions, particularly those in Europe. Pricing is comparable to Bloomberg.
Polygon.io aggregates US equities and crypto data from multiple sources and offers a tiered subscription model:
- Starter: $100-200 per month (delayed/limited real-time)
- Professional: $500+ per month (full real-time)
Polygon has become popular with retail traders and small quantitative firms because it's more affordable than traditional vendors while offering reasonable data quality.
Alpaca Data offers market data through the Alpaca platform (which also provides brokerage services). Alpaca licenses data from exchanges and resells it, offering various tiers. The advantage is integration: traders can place trades and access data through the same platform.
Finnhub, Alpha Vantage, and other free/low-cost providers offer delayed or basic data free or at minimal cost, suitable for educational purposes and backtesting, but not professional trading.
The consolidation trend: As exchanges have consolidated into larger groups (ICE, CME, Deutsche Börse), they've also consolidated data sales. ICE, which owns NYSE and the former LIFFE derivatives markets (it acquired NYSE Euronext in 2013 and spun off Euronext in 2014), can bundle US, European, and derivatives data into packages. This bundling reduces fragmentation but also increases pricing power for consolidated entities.
Latency: Why Microseconds Matter
For high-frequency traders and market makers, latency—the delay between when a trade executes and when you receive information about it—is crucial. Different latency profiles enable different strategies:
100+ millisecond latency (retail-grade platforms): Acceptable for day traders and swing traders. A trader can see a price move and decide to buy or sell within a few seconds. Most retail platforms have this level of latency.
10-50 millisecond latency (professional platforms): Used by professional traders and institutions. Algorithms can react to price moves within tens of milliseconds, enabling strategies like momentum capture and arbitrage across multiple venues.
Under 1 millisecond, down to a few microseconds (co-located, professional high-frequency): Used by high-frequency traders running servers at exchange data centers. Latency advantages at this level enable microsecond-scale arbitrage and market-making strategies.
Under 1 microsecond: The absolute frontier of HFT. Some firms run FPGA (field-programmable gate array) systems that process market data at the nanosecond scale, enabling strategies based on detecting order flow patterns before most traders can even see them.
The relationship between latency and strategy is direct: if you have a 1-microsecond latency advantage over other traders, you can execute certain strategies (providing liquidity for microsecond timeframes) that other traders cannot. Conversely, if your latency is worse, those strategies are unavailable to you.
Latency arbitrage example: Suppose a stock trades on both the NYSE and the Nasdaq. A trader with direct feeds from both exchanges can see when a price discrepancy emerges between the two venues. If shares are offered at $100.00 on NYSE while a bid of $100.05 is resting on Nasdaq, a low-latency trader can immediately buy on NYSE and sell on Nasdaq, capturing $0.05 per share before fees. A trader with 100-millisecond latency might not see the opportunity; by the time data arrives, the arbitrage window may have closed.
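The check behind this example can be sketched in a few lines, treating the $100.00 price as a buyable ask on one venue and the $100.05 price as a sellable bid on the other. The `cross_venue_edge` function and the per-share fee are hypothetical; a real system would also have to account for available size and the risk that either quote disappears before the orders arrive.

```python
from typing import Optional

def cross_venue_edge(ask_a: float, bid_b: float, fee_per_share: float = 0.0) -> Optional[float]:
    """Per-share profit from buying at venue A's ask and selling at venue B's bid.

    Returns None when the trade would not be profitable after fees on both legs.
    """
    edge = bid_b - ask_a - 2 * fee_per_share
    return edge if edge > 0 else None

# Prices from the example above; the $0.003 fee per share per leg is an assumption.
edge = cross_venue_edge(ask_a=100.00, bid_b=100.05, fee_per_share=0.003)
print(f"edge per share: {edge:.3f}")
```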
Market Data Challenges and Data Quality
Real-world market data has imperfections that traders must handle:
Dropped packets: Network connections are not perfect. Occasionally, a data packet fails to transmit, and a trader briefly loses information about the latest bid/ask. Professional systems must detect missing data and request retransmission.
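One common way to detect missing data is to check per-message sequence numbers for gaps. The sketch below assumes the feed numbers its messages consecutively; `find_gaps` is a hypothetical helper, and the retransmission request itself is feed-specific and not shown.

```python
from typing import Iterable, List, Tuple

def find_gaps(seq_numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) ranges in a sequence-numbered stream."""
    gaps: List[Tuple[int, int]] = []
    expected = None  # next sequence number we expect to see
    for seq in seq_numbers:
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))
        if expected is None or seq >= expected:
            expected = seq + 1
        # seq < expected: an old or repeated message, ignored here
    return gaps

gaps = find_gaps([1, 2, 3, 6, 7, 10])
print(gaps)  # [(4, 5), (8, 9)]
```

Each returned range is what a handler would ask the feed to resend.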
Out-of-order messages: In distributed systems, messages can arrive out of sequence. A trade executed at 9:31:00.000 might arrive after a trade executed at 9:31:00.001. Systems must handle this reordering to maintain accurate market state.
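A small reorder buffer is one way to restore sequence order. This sketch assumes each message carries a consecutive sequence number; `in_order` is a hypothetical helper that holds early arrivals in a min-heap until the messages before them show up.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Message = Tuple[int, str]  # (sequence number, payload)

def in_order(messages: Iterable[Message], first_seq: int = 1) -> Iterator[Message]:
    """Yield messages in sequence order, buffering any that arrive early."""
    pending: List[Message] = []  # min-heap keyed on sequence number
    next_seq = first_seq
    for msg in messages:
        if msg[0] < next_seq:
            continue  # already released: a late duplicate
        heapq.heappush(pending, msg)
        while pending and pending[0][0] <= next_seq:
            seq, payload = heapq.heappop(pending)
            if seq == next_seq:
                yield seq, payload
                next_seq += 1
            # seq < next_seq: duplicate of a message released a moment ago, dropped

arrived = [(1, "trade A"), (3, "trade C"), (2, "trade B"), (4, "trade D")]
ordered = list(in_order(arrived))
print(ordered)
```

A message that never arrives would stall this buffer indefinitely, so a production handler would pair it with a timeout and a retransmission request.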
Duplicate messages: Exchange systems occasionally transmit the same message twice. Traders must implement duplicate detection and filtering.
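Duplicate filtering can be as simple as remembering recently seen message identifiers. The sketch below assumes each message has a unique ID (a sequence number, for instance); `drop_duplicates` and its `window` parameter are hypothetical, and the window keeps memory bounded on a long-running feed.

```python
from collections import deque
from typing import Deque, Hashable, Iterable, Iterator, Set, Tuple

def drop_duplicates(
    messages: Iterable[Tuple[Hashable, str]], window: int = 10_000
) -> Iterator[Tuple[Hashable, str]]:
    """Yield each message once, remembering only the last `window` message IDs."""
    seen: Set[Hashable] = set()
    recent: Deque[Hashable] = deque()
    for msg_id, payload in messages:
        if msg_id in seen:
            continue  # repeat of a recently seen message
        seen.add(msg_id)
        recent.append(msg_id)
        if len(recent) > window:
            seen.discard(recent.popleft())  # forget the oldest ID
        yield msg_id, payload

unique = list(drop_duplicates([(1, "a"), (2, "b"), (2, "b"), (3, "c"), (1, "a")]))
print(unique)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```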
Stale data: Some exchanges experience brief outages where they stop publishing data. Traders must detect staleness (no updates for an unusual time period) and alert operators.
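A staleness check compares the time of the last update for each symbol against the current time. In this sketch the `stale_symbols` function, the 5-second threshold, and the symbols are all hypothetical; a sensible threshold depends on how actively each instrument normally trades.

```python
from typing import Dict, List

def stale_symbols(last_update: Dict[str, float], now: float, max_age_s: float = 5.0) -> List[str]:
    """Return symbols whose most recent update is older than `max_age_s` seconds."""
    return sorted(sym for sym, ts in last_update.items() if now - ts > max_age_s)

# Timestamps in seconds; in a live system `now` would come from time.time().
last_update = {"AAA": 1000.0, "BBB": 1003.5, "CCC": 990.0}
stale = stale_symbols(last_update, now=1006.0)
print(stale)  # ['AAA', 'CCC']
```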
Symbol/identifier mismatches: Different exchanges and vendors use different identifiers. An exchange identifies a stock by its ticker symbol, such as "IBM"; some systems require RIC codes (Reuters Instrument Codes) like "IBM.N", where the suffix identifies the venue. Traders must maintain mapping tables to ensure they're tracking the right security.
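A mapping table of this kind might look like the following minimal sketch. The two entries are illustrative; the `to_ric` helper is hypothetical, and a real table would be loaded from a vendor's reference data rather than hard-coded.

```python
from typing import Dict

# Illustrative entries only: ticker -> RIC (".N" for NYSE, ".O" for Nasdaq).
RIC_BY_TICKER: Dict[str, str] = {
    "IBM": "IBM.N",
    "MSFT": "MSFT.O",
}
TICKER_BY_RIC: Dict[str, str] = {ric: ticker for ticker, ric in RIC_BY_TICKER.items()}

def to_ric(ticker: str) -> str:
    """Translate a ticker to its RIC, failing loudly on unknown symbols."""
    try:
        return RIC_BY_TICKER[ticker.upper()]
    except KeyError:
        raise KeyError(f"no RIC mapping for ticker {ticker!r}") from None

print(to_ric("ibm"))            # IBM.N
print(TICKER_BY_RIC["MSFT.O"])  # MSFT
```

Raising on an unknown symbol, rather than guessing a suffix, avoids silently tracking the wrong security.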
Tick size and rounding errors: Some international exchanges quote in increments of 0.0001; US exchanges typically quote stocks in increments of 0.01. Precision mismatches, especially when prices are stored as binary floating-point numbers, can cause subtle calculation errors.
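The sketch below shows one way to sidestep such errors in Python: keep prices as `decimal.Decimal` values and round explicitly to the venue's price increment. The `round_to_tick` helper and the sample prices are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_tick(price: str, tick: str) -> Decimal:
    """Round a price (given as a string) to the nearest multiple of `tick`."""
    p, t = Decimal(price), Decimal(tick)
    return (p / t).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * t

float_sum = 0.1 + 0.2                            # binary floats: not exactly 0.3
decimal_sum = Decimal("0.1") + Decimal("0.2")    # exactly 0.3

print(float_sum == 0.3)                    # False
print(decimal_sum == Decimal("0.3"))       # True
print(round_to_tick("100.0049", "0.01"))   # 100.00
print(round_to_tick("100.0049", "0.0001")) # 100.0049
```

The same price survives intact at a 0.0001 increment but loses its last two digits at 0.01, which is the kind of mismatch that appears when data from venues with different increments is combined.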
For retail traders using broker platforms, the broker typically handles these data quality issues. For institutional traders and quantitative firms, teams of engineers maintain systems to detect and correct data quality problems.
Real-World Examples
Example 1: A Day Trader's Data Setup
A day trader might have:
- Broker platform with Level 1 data: Free (subsidized by broker)
- Direct Level 2 feed from Nasdaq: $50/month
- Level 2 feed from NYSE: $50/month
- Direct Level 2 feed from CBOE for options: $30/month
- Polygon.io data for research: $200/month
Total cost: approximately $330/month, or $4,000 per year.
For a trader generating $100,000 in annual income from trading, this represents 4% of income spent on data—reasonable, though not trivial. The data feeds justify their cost by enabling the trader to identify trading opportunities and execute with minimal latency.
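The arithmetic for this example is simple enough to check directly. The line items and the $100,000 income figure come from the example above; the variable names are illustrative.

```python
# Monthly data costs from the day trader example (Level 1 is free, so omitted).
monthly_costs = {
    "Nasdaq Level 2": 50,
    "NYSE Level 2": 50,
    "CBOE options Level 2": 30,
    "Polygon.io research data": 200,
}

monthly_total = sum(monthly_costs.values())
annual_total = monthly_total * 12
share_of_income = annual_total / 100_000  # assumed annual trading income

print(f"monthly: ${monthly_total}, annual: ${annual_total}, share of income: {share_of_income:.1%}")
# monthly: $330, annual: $3960, share of income: 4.0%
```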
Example 2: A Retail Investor's Data Approach
A retail investor might:
- Use free Level 1 data from broker
- Check financial news sites (Yahoo Finance, CNBC) for updates
- Check price once or twice per day
Cost: $0. The investor accepts 15-20 minute delayed data because they're not actively trading intraday; they're making weekly or monthly rebalancing decisions based on fundamental analysis, not real-time price action.
Example 3: A Quantitative Hedge Fund's Data Infrastructure
A small quant fund might have:
- Co-location at NYSE and Nasdaq data centers: $2,000/month each = $4,000/month
- Direct high-speed feeds from NYSE, Nasdaq, CBOE: $1,500/month combined
- Bloomberg Terminal (2 seats): $40,000/year
- Polygon and other alternative providers for research: $2,000/year
- In-house data engineering infrastructure (amortized): $1,000+/month
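Total: approximately $120,000 per year ($48,000 co-location, $18,000 direct feeds, $40,000 Bloomberg, $2,000 research data, and $12,000 or more for infrastructure) plus significant engineering overhead. For a fund managing $100 million, this is approximately 0.12% of AUM, a small but necessary overhead.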
Common Mistakes
Mistake 1: Assuming Free Data Is Good Enough for Active Trading
A trader operating with delayed data from a free source loses the ability to react to rapid price moves. Even a 15-minute delay causes missed opportunities on moving stocks, and orders placed against stale prices may fill far from where the trader expected. For an active trader, the cost of real-time data is usually small relative to the cost of trading on stale prices.
Mistake 2: Not Understanding the Licensing Limitations on Market Data
Many brokers include real-time data with trading accounts, but with restrictions. For example, data might be "for personal use only" and not permitted for commercial purposes or sharing with others. A trader posting alerts or signals to subscribers might violate their data license. Understanding data licensing terms is critical to staying compliant.
Mistake 3: Ignoring Data Latency in Backtesting
When backtesting a trading strategy, traders sometimes assume they can execute instantly at the last traded price. In reality, a market order to buy executes at the ask price and a market order to sell executes at the bid price, and both arrive after some delay. Not accounting for bid-ask spreads and execution delays leads to optimistic backtesting results that don't reflect real trading.
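The size of this effect is easy to see with a small comparison. In the sketch below, `fill_price` is a hypothetical helper that fills buys at the ask and sells at the bid, and the quotes are made up: a 2-cent spread around last-trade prices of $100.00 at entry and $100.20 at exit.

```python
def fill_price(side: str, bid: float, ask: float) -> float:
    """Price a marketable order would get: buys pay the ask, sells receive the bid."""
    if side == "buy":
        return ask
    if side == "sell":
        return bid
    raise ValueError(f"unknown side: {side!r}")

shares = 100

# Naive backtest: assume fills at the last traded price.
naive_pnl = (100.20 - 100.00) * shares

# More realistic: buy at the entry ask, sell at the exit bid.
entry = fill_price("buy", bid=99.99, ask=100.01)
exit_ = fill_price("sell", bid=100.19, ask=100.21)
realistic_pnl = (exit_ - entry) * shares

print(f"naive: ${naive_pnl:.2f}, with spread: ${realistic_pnl:.2f}")
# naive: $20.00, with spread: $18.00
```

Here the spread alone removes a tenth of the apparent profit, before any allowance for execution delay or fees.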
Mistake 4: Trusting Data Without Validation
Exchange data occasionally has errors: duplicates, mismatched prices, stale quotes, etc. A trader who blindly trusts data might make decisions based on bad information. Implementing automated checks for data quality (e.g., detecting unusually large price moves that might indicate bad data) is essential.
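An automated check of this kind might flag any tick that is non-positive or that jumps too far from the last accepted price. The `flag_suspect_ticks` function and the 10% threshold are assumptions for illustration; the right threshold depends on the instrument.

```python
from typing import List, Optional

def flag_suspect_ticks(prices: List[float], max_move: float = 0.10) -> List[int]:
    """Return indices of prices that are non-positive or move more than
    `max_move` (as a fraction) away from the last accepted price."""
    suspects: List[int] = []
    last_good: Optional[float] = None
    for i, price in enumerate(prices):
        if price <= 0:
            suspects.append(i)
            continue
        if last_good is not None and abs(price - last_good) / last_good > max_move:
            suspects.append(i)
            continue
        last_good = price  # accepted: becomes the new reference
    return suspects

prices = [100.00, 100.05, 10.01, 100.10, 0.0, 100.12]
suspects = flag_suspect_ticks(prices)
print(suspects)  # [2, 4]
```

A flagged tick is a candidate for review, not proof of an error: a genuine gap on news would trip the same check, and every later tick would be flagged until the reference price is reset.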
Mistake 5: Underestimating the Cost of Multi-Exchange Data
A trader might budget for data from one exchange but later realize they need to monitor multiple exchanges. Costs multiply across exchanges. A data budget of $50/month for one exchange becomes $300/month for six exchanges. Traders should estimate all required data feeds upfront.
FAQ
Q: Is Level 1 data sufficient for long-term investing?
A: Yes. Long-term investors don't need real-time order book depth. Daily closing prices, monthly prices, or even quarterly performance is sufficient for making allocation decisions. The trade-off between cost and information value heavily favors cheaper/delayed data for long-term investors.
Q: Why do exchanges charge separately for market data if I'm already paying trading fees?
A: Exchanges argue that market data is a separate product from trading execution. The NYSE publishes market data (quotes, trades) that benefits all investors, and the exchange's position is that those who use the data should pay for its value. Additionally, market data licensing generates significant revenue for exchanges (often 10-20% of total revenue), and they have the market power to charge for it. Regulators in some countries have pushed for lower market data costs, but the US regulatory approach has allowed exchanges to charge independently.
Q: Can I use delayed data for algorithmic trading?
A: Delayed data is useless for algorithmic trading if the algorithm is meant to react to intraday price moves. However, algorithms designed for longer timeframes (daily or weekly) can use delayed or even historical data. Many quantitative funds use end-of-day data for portfolio rebalancing and strategic allocation decisions.
Q: How do I verify the quality of market data I'm receiving?
A: Comparison across sources is the primary method. If you're receiving data from your broker and also from a third-party feed, comparing the two can reveal discrepancies. Additionally, cross-exchange consistency checks (e.g., ensuring that the NBBO is never violated) can identify stale or bad data.
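Both checks can be sketched briefly. The function names (`compare_sources`, `is_crossed`), the one-cent tolerance, and the sample prices below are hypothetical; in practice the two snapshots must be taken at the same moment for the comparison to mean anything.

```python
from typing import Dict, List, Tuple

def compare_sources(
    a: Dict[str, float], b: Dict[str, float], tolerance: float = 0.01
) -> List[Tuple[str, float, float]]:
    """Return (symbol, price_a, price_b) where two sources disagree by more than `tolerance`."""
    mismatches = []
    for symbol in sorted(a.keys() & b.keys()):  # only symbols present in both
        if abs(a[symbol] - b[symbol]) > tolerance:
            mismatches.append((symbol, a[symbol], b[symbol]))
    return mismatches

def is_crossed(bid: float, ask: float) -> bool:
    """A best bid above the best ask usually indicates stale or bad data."""
    return bid > ask

broker = {"AAA": 100.00, "BBB": 50.25, "CCC": 10.00}
vendor = {"AAA": 100.00, "BBB": 50.75, "DDD": 5.00}
mismatches = compare_sources(broker, vendor)
print(mismatches)                 # [('BBB', 50.25, 50.75)]
print(is_crossed(100.05, 100.00)) # True
```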
Q: Is Bloomberg the only professional market data source?
A: No, though it's the most popular. Reuters (Refinitiv), FactSet, and other vendors provide professional-grade data. However, Bloomberg has the largest market share among institutions due to its comprehensive ecosystem (data, analysis, communication, trading). For specific purposes, specialized sources (like Polygon for equities, or an exchange's own data service for the futures it lists, such as CME Group's for COMEX contracts) may be better value.
Q: Can I build my own market data system using exchange APIs?
A: Yes, but it requires substantial engineering effort. Some exchanges offer APIs for accessing market data; your engineers can build systems to consume these APIs and aggregate data. However, for most traders and firms, using existing vendors is more cost-effective than building in-house (which requires hiring engineers and maintaining infrastructure).
Q: How is latency measured in market data?
A: Latency is measured from the timestamp of an event (a trade execution) to when an observer receives and can act upon the information. Timestamps on trades are set by exchange systems; latency is measured between that timestamp and when an external system receives the message. For retail platforms, this might be 100-200 milliseconds; for co-located systems at an exchange, it might be under 1 millisecond.
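As a minimal sketch of this measurement, the code below subtracts the exchange timestamp from the local receive timestamp for each message and summarizes the result. The nanosecond timestamps are made up, `latencies_us` is a hypothetical helper, and the calculation assumes the exchange and receiver clocks are synchronized; any clock offset would show up directly in the numbers.

```python
from statistics import median
from typing import List, Tuple

def latencies_us(events: List[Tuple[int, int]]) -> List[float]:
    """Convert (exchange_ts_ns, receive_ts_ns) pairs to latencies in microseconds."""
    return [(recv_ns - exch_ns) / 1_000 for exch_ns, recv_ns in events]

# Hypothetical (exchange timestamp, local receive timestamp) pairs in nanoseconds.
events = [
    (1_000_000, 1_250_000),
    (2_000_000, 2_180_000),
    (3_000_000, 3_900_000),
]
lat = latencies_us(events)
print(f"median: {median(lat)} us, worst: {max(lat)} us")
# median: 250.0 us, worst: 900.0 us
```

Reporting the worst case alongside the median matters because occasional slow messages are often what hurt a latency-sensitive strategy.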
Related Concepts
- Market Microstructure: The analysis of how market data, order books, and price discovery work at fine timescales
- High-Frequency Trading and Latency Arbitrage: HFT strategies depend critically on low-latency data feeds and the ability to react faster than other market participants
- Alternative Data and Non-Traditional Signals: Beyond market data from exchanges, traders use alternative data (satellite imagery, credit card transactions, etc.) for trading signals
- Data Compression and Bandwidth: Real-time market data can consume substantial bandwidth; traders optimize data pipelines through compression and filtering
- Regulatory Reporting and Market Transparency: Exchanges are required by regulators to publish market data; this transparency requirement is separate from market data licensing
Summary
Market data feeds are the lifeblood of financial markets, transmitting thousands of prices per second globally and enabling trading decisions. Tiers range from basic Level 1 (best bid/ask, free on many platforms) to granular Level 3 (trade-by-trade executions, expensive and specialized). Real-time data costs money ($20-500+ per month depending on tier and exchange) while delayed data (15-20 minute delay) is often free.
For retail investors, free or delayed data is appropriate; for active traders, real-time Level 2 data from key exchanges justifies the cost. For high-frequency traders, microsecond-latency co-located feeds at data centers are essential. Understanding what data you need, accepting the appropriate latency tradeoff, and validating data quality are critical skills for traders at all levels. The cost of good data is negligible compared to the losses from poor data or missed opportunities.