Probability and Statistics: Introduction

Chapter 2: Statistical Tables and Graphical Representations

I. Introduction: Statistical tables are a good starting point for summarizing and organizing data. Once one has a set of data, a natural first step is to organize it to see the frequency of each value, that is, how often each value occurs in the set. Statistical tables can be used to present either quantitative or categorical data.

Graphical representations are tools that help us learn about the distribution, or shape, of a sample or a population. A graph can be a more effective way of presenting data than a mass of numbers.

II. Statistical Table: In statistics, tables present data in a structured and more legible manner. We can distinguish three types of tables: the data table (or elementary table), the frequency distribution table (also called the table of counts) and the relative frequency distribution table.

II.1 Data Table: Raw data are hard to read. The information becomes more readable once it is grouped in a data table. This is why, in any classic statistical approach, data tables are the first to be drawn up. These tables facilitate the processing of the data and report on it. Example: an Excel spreadsheet.

Every table is made up of rows and columns. To construct our data table we must therefore draw rows and columns. The columns list the characters studied and the rows correspond to the individuals observed.

	Column 1 = Variable 1	Column 2 = Variable 2
Row 1 = Individual 1
Row 2 = Individual 2		Cell = modality or value

Example: Recall the illustrative example from the first chapter: a statistical study conducted on first-year students at the Department of Mathematics, in which the teacher asked his students about:

- The color of their eyes.

- Their behaviour towards morning coffee.

- The number of sisters and brothers they have.

- Their heights (given below in m).

The responses provided by the students are the data to be studied. They are given by the following four statistical series, one for each of the four characters:

Eye color: Black, Blue, Blue, Black, Brown, Blue, Black, Blue, Green, Brown, Brown, Green, Brown, Brown, Brown, Black, Blue, Black, Brown, Green.

Behaviour towards morning coffee: Sometimes, Often, Sometimes, Always, Often, Always, Often, Always, Sometimes, Always, Often, Sometimes, Sometimes, Never, Often, Never, Sometimes, Always, Never, Sometimes.

Number of sisters and brothers: 4 3 5 6 1 3 7 4 5 4 2 2 3 3 2 5 3 3 0 4

Height (in m): 1.59 1.45 1.53 1.73 1.50 1.72 1.61 1.50 1.71 1.63 1.80 1.58 1.69 1.66 1.69 1.75 1.73 1.65 1.64 1.55

The four statistical series can be organised in the following table:

Student	Eye color	Morning coffee	Sisters and brothers	Height (m)
student 1	Black	Sometimes	4	1.59
student 2	Blue	Often	3	1.45
student 3	Blue	Sometimes	5	1.53
student 4	Black	Always	6	1.73
student 5	Brown	Often	1	1.50
student 6	Blue	Always	3	1.72
student 7	Black	Often	7	1.61
student 8	Blue	Always	4	1.50
student 9	Green	Sometimes	5	1.71
student 10	Brown	Always	4	1.63
student 11	Brown	Often	2	1.80
student 12	Green	Sometimes	2	1.58
student 13	Brown	Sometimes	3	1.69
student 14	Brown	Never	3	1.66
student 15	Brown	Often	2	1.69
student 16	Black	Never	5	1.75
student 17	Blue	Sometimes	3	1.73
student 18	Black	Always	3	1.65
student 19	Brown	Never	0	1.64
student 20	Green	Sometimes	4	1.55

II.2. Frequency distribution table: The frequency distribution table reorganises the data of the data table and presents them in a clearer and more concise manner, without losing any of the information contained in the original statistical series. The construction of the table of counts depends on the nature of the character studied. For a qualitative character or a discrete quantitative one, it is built directly from the observed modalities or values.

However, in the case of a continuous character, the construction requires passing through classes: the data are grouped into k semi-open intervals, where k is given by one of the two following formulas:

Sturges' rule: k = 1 + 3.3 log10(n)

Yule's rule: k = 2.5 n^(1/4)

and n is the number of observations. The construction of the classes is done as follows:

1. We calculate the range of the statistical series, that is, the largest value minus the smallest value.

2. We determine the length of the classes as the range divided by k.
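The two rules for the number of classes can be sketched in a few lines of Python. This is only a minimal illustration: the function names `sturges_classes`, `yule_classes` and `class_length` are hypothetical, the logarithm is assumed to be base 10, and the class length is assumed to be the range divided by k (the usual convention).

```python
import math

def sturges_classes(n):
    """Sturges' rule as written above: k = 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)

def yule_classes(n):
    """Yule's rule as written above: k = 2.5 * n^(1/4)."""
    return 2.5 * n ** 0.25

def class_length(smallest, largest, k):
    """Assumed convention: range of the series divided by the number of classes."""
    return (largest - smallest) / k

n = 20                          # number of students in the running example
k = round(sturges_classes(n))   # both rules give about 5.29 here, rounded to 5
print(k, round(yule_classes(n)))              # 5 5
print(round(class_length(1.45, 1.80, k), 2))  # 0.07 (heights from 1.45 m to 1.80 m)
```

For n = 20 the two rules agree after rounding, so the choice between them does not matter in this example.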

The table of counts is made up of one column listing the modalities (or values, or classes) of the character studied and another column giving the number of occurrences of each modality (or value, or class).

modalities (values or classes)	Counts (or frequency)
...	...

Note: In the same way, we define the relative frequency distribution table by replacing the counts by the relative frequencies, that is, each count divided by the total number of observations.
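As a small sketch of this note, the relative frequencies of the series of sisters and brothers can be computed as count divided by total. The helper name `relative_frequencies` is hypothetical and only serves the illustration.

```python
from collections import Counter

# Number of sisters and brothers reported by the 20 students.
siblings = [4, 3, 5, 6, 1, 3, 7, 4, 5, 4, 2, 2, 3, 3, 2, 5, 3, 3, 0, 4]

def relative_frequencies(series):
    """Map each distinct value to its count divided by the number of observations."""
    counts = Counter(series)
    n = len(series)
    return {value: counts[value] / n for value in sorted(counts)}

freqs = relative_frequencies(siblings)
for value, f in freqs.items():
    print(value, f)
```

The relative frequencies always add up to 1 (up to floating-point rounding), which is a quick way to check such a table.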

Example: Notice that:

- The color of eyes is a qualitative nominal character with four modalities: Black, Blue, Green and Brown.

- The behaviour towards morning coffee is a qualitative ordinal character with four modalities: Never, Sometimes, Often, Always.

- The number of sisters and brothers is a quantitative discrete character with values: 0 1 2 3 4 5 6 7.

- The height is a quantitative continuous character with values ranging between 1.45 m and 1.80 m.

Then, for the first three statistical variables, we directly provide the corresponding statistical tables as follows:

- For the color of eyes (nominal):

Modality	Count
Black	5
Blue	5
Green	3
Brown	7

- For the behaviour towards morning coffee (ordinal):

Modality	Count
Never	3
Sometimes	7
Often	5
Always	5
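The counts of the two qualitative characters can be checked by tallying the series directly. The sketch below uses `collections.Counter`; the variable names and the ordering list `scale` are illustrative choices, not part of the course material.

```python
from collections import Counter

eye_color = ["Black", "Blue", "Blue", "Black", "Brown", "Blue", "Black",
             "Blue", "Green", "Brown", "Brown", "Green", "Brown", "Brown",
             "Brown", "Black", "Blue", "Black", "Brown", "Green"]

coffee = ["Sometimes", "Often", "Sometimes", "Always", "Often", "Always",
          "Often", "Always", "Sometimes", "Always", "Often", "Sometimes",
          "Sometimes", "Never", "Often", "Never", "Sometimes", "Always",
          "Never", "Sometimes"]

eye_counts = Counter(eye_color)
coffee_counts = Counter(coffee)

# Nominal character: the order of the modalities carries no meaning.
for modality in ["Black", "Blue", "Green", "Brown"]:
    print(modality, eye_counts[modality])

# Ordinal character: list the modalities along their natural scale.
scale = ["Never", "Sometimes", "Often", "Always"]
for modality in scale:
    print(modality, coffee_counts[modality])
```

In both cases the counts add up to 20, the number of students.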

- For the number of sisters and brothers (discrete):

Value	Count
0	1
1	1
2	3
3	6
4	4
5	3
6	1
7	1

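- For the heights, which form a continuous character (measured in m):

1) Calculate the number of classes:

k = 1 + 3.3 log10(20) ≈ 5.29, which we round to k = 5.

2) Calculate the range of the statistical series: 1.80 - 1.45 = 0.35.

3) Determine the length of the classes: 0.35 / 5 = 0.07. The table below uses the slightly larger length 0.08, so that the largest value, 1.80, falls inside the last semi-open class.

4) Construction of the statistical table:

Classes	Count
[1.45 , 1.53)	3
[1.53 , 1.61)	4
[1.61 , 1.69)	5
[1.69 , 1.77)	7
[1.77 , 1.85)	1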
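The class counts can be reproduced with a short sketch that assigns each height to its semi-open class. Working in whole centimetres is a choice made here (not taken from the course) so that boundary values such as 1.53 are compared exactly rather than through floating-point arithmetic. The starting point 1.45 and the length 0.08 are read off the table.

```python
heights_m = [1.59, 1.45, 1.53, 1.73, 1.50, 1.72, 1.61, 1.50, 1.71, 1.63,
             1.80, 1.58, 1.69, 1.66, 1.69, 1.75, 1.73, 1.65, 1.64, 1.55]

# Convert to whole centimetres to avoid floating-point surprises at class boundaries.
heights_cm = [round(h * 100) for h in heights_m]

start_cm, width_cm, k = 145, 8, 5   # first lower bound, class length, number of classes

counts = [0] * k
for h in heights_cm:
    # Class i is [start + i*width, start + (i+1)*width): lower bound included, upper excluded.
    index = (h - start_cm) // width_cm
    counts[index] += 1

for i, c in enumerate(counts):
    low = (start_cm + i * width_cm) / 100
    high = (start_cm + (i + 1) * width_cm) / 100
    print(f"[{low:.2f} , {high:.2f})\t{c}")
```

Because the intervals are closed on the left, a height of 1.53 m is counted in [1.53 , 1.61) and not in [1.45 , 1.53).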

III. Graphical Representation: A graphical representation brings out the relevant part of the information contained in the data. The type of graph depends on the nature of the character: pie chart (nominal), bar chart (ordinal), vertical line chart (discrete) and histogram (continuous).

III.1. Pie Chart: A pie chart is a circular statistical graphic divided into slices to illustrate numerical proportion, where each slice represents a percentage of the total whole. It is best used for comparing parts of a single category to a total (100%), often organized from largest to smallest slice for easier interpretation.

Example: For the color of eyes, which is a nominal character, the pie chart is given by:
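To draw such a pie chart by hand, each slice needs a central angle proportional to its relative frequency, that is, 360° times count divided by total. A minimal sketch of that arithmetic, with counts obtained by tallying the 20 answers of the eye-color series (the variable names are illustrative):

```python
# Counts obtained by tallying the 20 answers in the eye-color series.
eye_counts = {"Black": 5, "Blue": 5, "Green": 3, "Brown": 7}

total = sum(eye_counts.values())

# Central angle of each slice: 360 degrees times the relative frequency.
angles = {modality: 360 * count / total for modality, count in eye_counts.items()}

for modality, angle in angles.items():
    share = 100 * eye_counts[modality] / total
    print(f"{modality}: {share:.0f}% of the students, {angle:.0f} degrees")
```

The angles add up to 360°, just as the percentages add up to 100%.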

III.2. Bar Chart: A bar chart provides a way of showing data values represented as vertical bars. It is sometimes used to show trend data, and the comparison of multiple data sets side by side.

Example: For the behaviour towards morning coffee, which is an ordinal character, the bar chart is given by:

III.3. Vertical Lines Chart: A vertical line chart is a specialized visualization primarily used to display discrete data. It uses individual vertical lines (also called "stems") to represent the magnitude of a category or a specific data point. Each vertical line corresponds to a specific value on the x-axis.

Example: For the number of sisters and brothers, the vertical line chart is given by:

III.4. Histogram: A histogram is a graphical representation that organizes a group of data points into specified ranges (called bins). It is the most commonly used tool to visualize the distribution of a continuous dataset. The x-axis shows the bins, or intervals (classes), and the y-axis shows the corresponding frequencies. Unlike a bar chart, the bars in a histogram usually touch each other.

Example: The continuous character H, the height, can be represented by the following histogram:
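As a rough text-only sketch of such a histogram, the class counts of the heights can be rendered with one row of `#` symbols per class. The bars run horizontally here, and the helper `text_histogram` is a hypothetical name. Since all classes have the same length, the bar lengths are proportional to the counts.

```python
# Classes and counts of the heights (in m), taken from the frequency table.
classes = [("[1.45 , 1.53)", 3),
           ("[1.53 , 1.61)", 4),
           ("[1.61 , 1.69)", 5),
           ("[1.69 , 1.77)", 7),
           ("[1.77 , 1.85)", 1)]

def text_histogram(rows):
    """Return one line per class: label, a bar of '#' symbols, then the count."""
    return [f"{label} {'#' * count} {count}" for label, count in rows]

lines = text_histogram(classes)
print("\n".join(lines))
```

The longest bar identifies the class [1.69 , 1.77), which contains the most students.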

Last modified: Monday, 13 April 2026, 14:57
