What are statistics?
Statistics are quantities calculated from data. In theory, any calculation performed on data yields a statistic; in practice, the term is usually associated with scientific and field-specific calculations. For example, during the data preparation phase, data is frequently explored and analysed using simple plotting techniques that present meaningful descriptive statistics of the data or a data subset, such as the mean, the median, and the quartiles. As another example, in macroeconomics, various calculations with data yield well-known statistics such as GDP, inflation, or the trade balance by sector. These field-specific statistics are then applied in other fields, such as investment or policy analysis, through applications of decision science.
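The descriptive statistics named above can be computed in a few lines. The following is a minimal sketch using Python's standard `statistics` module; the sample values are made up purely for illustration, and the "inclusive" quartile method is only one of several common conventions.

```python
import statistics

# Hypothetical sample values, invented purely for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12]

# Central tendency.
mean = statistics.mean(sample)
median = statistics.median(sample)

# Quartiles: the three cut points that split the sorted data into four parts.
# method="inclusive" treats the sample as containing its own min and max;
# other conventions give slightly different cut points.
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")

# Interquartile range: the spread of the middle half of the data.
iqr = q3 - q1

print(f"mean={mean}, median={median}, quartiles=({q1}, {q2}, {q3}), IQR={iqr}")
```

The second quartile is the median, so `q2` and `median` agree; a box plot draws exactly these quartile values.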
How are statistics like mean and median useful?
Data science is closely intertwined with statistics. After a data set is collected, for example, simple statistics are used to check that the data set matches the parameters and constraints given in the data plan. These same simple statistics, along with more complex ones, enable statistical inference about the characteristics of the underlying population. When we operate on data to derive meaningful outcomes from our analysis, we employ statistical and probabilistic models. Sometimes this is done stochastically, through repeated experimental trials; sometimes it is done statically, with models that are calibrated on historical data and run through forward scenarios with simulated shocks. Both simple and complex statistics allow us to draw conclusions about cause and effect between predictor and predicted variables. Theories about causation may be supported or refuted by the evidence, and the models themselves may be shown to be more or less valid.
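As one illustration of inferring a population characteristic through repeated trials, the sketch below estimates an interval for a population mean with a percentile bootstrap. The bootstrap is not named in the text above; it is chosen here only as a compact example of stochastic inference. The sample values, the function name `bootstrap_mean_interval`, and its parameters are all hypothetical.

```python
import random
import statistics

# Hypothetical observed sample (illustrative values only).
sample = [12.1, 9.8, 11.4, 10.6, 13.2, 9.1, 10.9, 12.7, 11.8, 10.3]


def bootstrap_mean_interval(data, rng, n_resamples=2000, level=0.95):
    """Percentile bootstrap interval for the mean of `data`.

    Each trial resamples the data with replacement and records the mean;
    the interval is read off the sorted resampled means.
    """
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper


rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample_mean = statistics.fmean(sample)
lower, upper = bootstrap_mean_interval(sample, rng)

print(f"sample mean = {sample_mean:.2f}")
print(f"approximate 95% interval for the population mean: ({lower:.2f}, {upper:.2f})")
```

The simple statistic (the sample mean) describes the data in hand, while the interval is a statement about the underlying population, under the assumption that the sample is representative of it.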
Are there practical examples?
Although many statistical or probabilistic models, such as those that describe the results of sequences of experimental trials like coin tosses, seem theoretical and abstract, they have many very practical applications. We can envision many of these by simply listing random variables. A random variable is a function of the outcome of an experiment of chance. For example, the outcome of a toss of a die is a random variable, as is the future price of a stock at a given time, or the number of hits to a website after its URL appears in a newspaper. In fact, any uncertain future quantity can be expressed as a random variable and assigned an expected value. This is especially important in industries that rely on pricing forward risk, like investment and insurance, and in those that deal with the well-being of the population, like health sciences, but it matters across every sector of the economy. Manufacturing, for example, relies on statistical and probabilistic inference to ensure products will perform within given safety parameters. In transportation, statistics and probability inform scheduling and routing. So yes, statistics are practical.
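The die toss mentioned above makes a small worked example of a random variable and its expected value. The sketch below assumes a fair six-sided die; the names `face_value` and `is_even` are hypothetical helpers introduced only for illustration. It computes the expected value exactly and then compares it with the average of many simulated tosses.

```python
import random
from fractions import Fraction

# The possible outcomes of the experiment: the faces of a fair six-sided die.
outcomes = range(1, 7)


def face_value(outcome):
    """Random variable: the number shown on the die."""
    return outcome


def is_even(outcome):
    """Another random variable on the same experiment: 1 if even, else 0."""
    return 1 if outcome % 2 == 0 else 0


def expected_value(random_variable):
    """Exact expectation, weighting each outcome by its probability 1/6."""
    return sum(Fraction(random_variable(o), 6) for o in outcomes)


expected_face = expected_value(face_value)
expected_even = expected_value(is_even)

# Simulate many tosses; the sample average should approach the expectation.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_tosses = 100_000
simulated_mean = sum(face_value(rng.randint(1, 6)) for _ in range(n_tosses)) / n_tosses

print(f"expected face value: {expected_face} (= {float(expected_face)})")
print(f"probability of an even face: {expected_even}")
print(f"average over {n_tosses} simulated tosses: {simulated_mean:.3f}")
```

The same pattern, a function of an uncertain outcome weighted by probabilities, is what underlies pricing a stock or an insurance policy, only with outcomes and probabilities that have to be estimated from data rather than known in advance.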