Formally, a stream is an ordered pair: a sequence of tuples (each a finite list of elements) and the time intervals between those tuples. In other words, it is a digitally encoded, coherent sequence of signals. In practice, comparing a data stream to a data batch is like comparing a stream of water from a faucet to a bucket of water. While the bucket can be analyzed with ordinary whole-collection functions (how much does it weigh?), the same analysis cannot be applied to the stream, because the stream is potentially unbounded.
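The faucet-versus-bucket distinction can be sketched in code. This is a minimal illustration, assuming a hypothetical `sensor_readings` generator as the unbounded source; the point is only that whole-collection functions work on the batch but would never return on the stream.

```python
import itertools

def sensor_readings():
    """A potentially unbounded stream, modeled as an infinite generator."""
    n = 0
    while True:
        yield n % 10  # hypothetical reading value
        n += 1

batch = [3, 7, 1, 4]       # a "bucket": finite, so whole-collection functions apply
total = sum(batch)         # fine for a batch

stream = sensor_readings() # a "faucet": sum(stream) would loop forever
# Instead, a stream is consumed incrementally, e.g. a bounded slice at a time:
first_five = list(itertools.islice(stream, 5))
```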
Data streams are analyzed differently: you can filter the stream for particular attributes and create a new stream from that subset. These filters can then be chained together into pipelines.
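Filtering and pipelining can be sketched with lazy generators. The log lines and the `errors_only`/`payment_related` filters below are made-up examples, not part of any particular streaming framework; each stage consumes one stream and yields a new one, so stages compose naturally.

```python
def errors_only(lines):
    """Filter: keep only log lines marked as errors."""
    return (line for line in lines if "ERROR" in line)

def payment_related(lines):
    """Filter: keep only lines mentioning payments."""
    return (line for line in lines if "payment" in line)

# A small stand-in for a continuous log source.
log_stream = iter([
    "INFO  user logged in",
    "ERROR payment declined",
    "ERROR disk full",
    "INFO  payment accepted",
])

# Filters combine into a pipeline; each stage lazily yields a new stream.
pipeline = payment_related(errors_only(log_stream))
result = list(pipeline)
```

Because each stage is lazy, the pipeline processes one element at a time and never needs the whole stream in memory.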
Stream analytics is handy when you have a continuous source of data, for example, application logs from a business system that operates 24/7. Traditionally, these records would have been analyzed in batches: you would take all the logs from a specified time window and examine that batch as a whole. The analysis was typically a routine carried out weekly, or perhaps quarterly.
A practical example is a corporate financial statement. Traditionally, this has been constructed as a batch operation at the end of the fiscal year. With streaming analytics, however, the financial statement can be kept up to date in real time and published immediately when the fiscal year ends.
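The difference between the batch and streaming approaches can be shown with a running aggregate. This is a simplified sketch with made-up transaction amounts: instead of summing the whole year's transactions once at closing time, the figure is updated incrementally as each transaction arrives, so a current snapshot exists at every moment.

```python
def running_revenue(transactions):
    """Update the revenue figure incrementally as each transaction arrives,
    rather than summing the full year's batch at closing time."""
    total = 0.0
    for amount in transactions:
        total += amount
        yield total  # the "statement" is current after every event

# Hypothetical transaction stream (positive = income, negative = refund).
transactions = [120.0, -30.0, 250.0]
snapshots = list(running_revenue(transactions))
# The final snapshot matches what a year-end batch sum would produce.
```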
The challenge of implementing streaming analytics is that it requires the organization to approach the problem from a new angle and to set up a distinct type of architecture. Typically, a streaming analytics platform has at least two layers, storage and analytics, with additional layers possible for fault tolerance and scalability.