Taking a database from Kaggle, in this notebook I analyze some of the Football teams from the most popular European Leagues. The database is composed by 199 columns and 7 tables.
Description of the tables:
Country
(11 rows and 2 columns): Describe the countries which the leagues belong to.
League
(11 rows and 3 columns): Describe the name of the leagues and the country they belong.
Match
(26k rows and 115 columns): Describe the different matches among the teams in their leagues. The table specifies the date of the match and the goals every team scored.
Player
(11.1k rows and 7 columns): Describe the player id, their name and features such as birth date, height and weight.
Player_Attributes
(+184k rows and 42 columns): Describes attributes of the different players such as rating, preferred foot, and potential, among others. These values are based in the FIFA attributes.
Team
(299 rows and 5 columns): Describe the teams, with their long and short names, leagues, and id from FIFA.
Team_Attributes
(1458 rows and 25 columns): Descibes teams attributes such as Play of Speed, type of Defence, Creation on Passing, and other parameters that define the playstyle of the teams.
This is a very extensive dataset with more than 11000 players, 300 teams and more than 25k matches. Most of the attributes I mention above are the ones I will be using to answer different questions.
The main goal of this analysis if the use of SQL (SQLite) language to extract analytical information to answer specific questions and provide different insights.
Technical skills used in SQLite:
Topics addressed in the analysis:
1. Information of the database
2.1 Analysis of Data Wrangling
3.2 Summary of Classes of Attributes
3.3.1 Top 5 Teams per Attribute
3.3.2 Definning Winners and Lossers
3.3.3 Ranking of Teams per Wins
3.3.4 Best Teams per Season. Summary
3.4.1 Summary of Team Attributes Analysis
3.5 Attributes vs Team Victory
3.5.1 Attributes vs Team Victory. Summary of Results
3.6 Combined Attributes vs Team Victory
3.6.1 Combined Attributes vs Team Victory. Summary of Results
# Installing ipython-sql to use SQL with Python
!conda install -yc conda-forge ipython-sql
Collecting package metadata (current_repodata.json): ...working... done Solving environment: ...working... done ## Package Plan ## environment location: C:\Users\manit\anaconda3\envs\sql_analysis added / updated specs: - ipython-sql The following packages will be downloaded: package | build ---------------------------|----------------- ca-certificates-2020.12.5 | h5b45459_0 173 KB conda-forge certifi-2020.12.5 | py39hcbf5309_0 144 KB conda-forge ipython-sql-0.3.9 |py39hde42818_1002 28 KB conda-forge openssl-1.1.1i | h8ffe710_0 5.8 MB conda-forge prettytable-2.0.0 | pyhd8ed1ab_0 22 KB conda-forge python_abi-3.9 | 1_cp39 4 KB conda-forge sqlalchemy-1.3.20 | py39h4cdbadb_0 1.8 MB conda-forge sqlparse-0.4.1 | pyh9f0ad1d_0 34 KB conda-forge ------------------------------------------------------------ Total: 8.0 MB The following NEW packages will be INSTALLED: ipython-sql conda-forge/win-64::ipython-sql-0.3.9-py39hde42818_1002 prettytable conda-forge/noarch::prettytable-2.0.0-pyhd8ed1ab_0 python_abi conda-forge/win-64::python_abi-3.9-1_cp39 sqlalchemy conda-forge/win-64::sqlalchemy-1.3.20-py39h4cdbadb_0 sqlparse conda-forge/noarch::sqlparse-0.4.1-pyh9f0ad1d_0 The following packages will be SUPERSEDED by a higher-priority channel: ca-certificates pkgs/main::ca-certificates-2020.12.8-~ --> conda-forge::ca-certificates-2020.12.5-h5b45459_0 certifi pkgs/main::certifi-2020.12.5-py39haa9~ --> conda-forge::certifi-2020.12.5-py39hcbf5309_0 openssl pkgs/main::openssl-1.1.1i-h2bbff1b_0 --> conda-forge::openssl-1.1.1i-h8ffe710_0 Downloading and Extracting Packages certifi-2020.12.5 | 144 KB | | 0% certifi-2020.12.5 | 144 KB | #1 | 11% certifi-2020.12.5 | 144 KB | ########## | 100% ca-certificates-2020 | 173 KB | | 0% ca-certificates-2020 | 173 KB | ##7 | 28% ca-certificates-2020 | 173 KB | ########## | 100% ipython-sql-0.3.9 | 28 KB | | 0% ipython-sql-0.3.9 | 28 KB | #####7 | 58% ipython-sql-0.3.9 | 28 KB | ########## | 100% openssl-1.1.1i | 5.8 MB | | 0% openssl-1.1.1i | 5.8 MB | | 0% openssl-1.1.1i | 5.8 MB | 1 | 2% openssl-1.1.1i | 5.8 MB | 4 | 4% openssl-1.1.1i | 5.8 MB | 7 | 8% openssl-1.1.1i | 5.8 MB | #3 | 14% openssl-1.1.1i | 5.8 MB | #5 | 16% openssl-1.1.1i | 5.8 MB | #8 | 19% openssl-1.1.1i | 5.8 MB | ##1 | 21% openssl-1.1.1i | 5.8 MB | ##3 | 24% openssl-1.1.1i | 5.8 MB | ##5 | 26% openssl-1.1.1i | 5.8 MB | ##9 | 30% openssl-1.1.1i | 5.8 MB | ###5 | 35% openssl-1.1.1i | 5.8 MB | #### | 41% openssl-1.1.1i | 5.8 MB | ####4 | 45% openssl-1.1.1i | 5.8 MB | ####9 | 50% openssl-1.1.1i | 5.8 MB | #####3 | 53% openssl-1.1.1i | 5.8 MB | #####6 | 57% openssl-1.1.1i | 5.8 MB | ######1 | 62% openssl-1.1.1i | 5.8 MB | ######6 | 66% openssl-1.1.1i | 5.8 MB | ######9 | 70% openssl-1.1.1i | 5.8 MB | #######5 | 75% openssl-1.1.1i | 5.8 MB | #######9 | 80% openssl-1.1.1i | 5.8 MB | ########2 | 83% openssl-1.1.1i | 5.8 MB | ########6 | 86% openssl-1.1.1i | 5.8 MB | ########8 | 89% openssl-1.1.1i | 5.8 MB | #########3 | 94% openssl-1.1.1i | 5.8 MB | ########## | 100% openssl-1.1.1i | 5.8 MB | ########## | 100% prettytable-2.0.0 | 22 KB | | 0% prettytable-2.0.0 | 22 KB | #######1 | 72% prettytable-2.0.0 | 22 KB | ########## | 100% sqlparse-0.4.1 | 34 KB | | 0% sqlparse-0.4.1 | 34 KB | ####7 | 48% sqlparse-0.4.1 | 34 KB | ########## | 100% python_abi-3.9 | 4 KB | | 0% python_abi-3.9 | 4 KB | ########## | 100% python_abi-3.9 | 4 KB | ########## | 100% sqlalchemy-1.3.20 | 1.8 MB | | 0% sqlalchemy-1.3.20 | 1.8 MB | | 1% sqlalchemy-1.3.20 | 1.8 MB | 6 | 7% sqlalchemy-1.3.20 | 1.8 MB | #4 | 14% sqlalchemy-1.3.20 | 1.8 MB | #8 | 19% sqlalchemy-1.3.20 | 1.8 MB | ##3 | 24% sqlalchemy-1.3.20 | 1.8 MB | ##7 | 28% sqlalchemy-1.3.20 | 1.8 MB | ###2 | 32% sqlalchemy-1.3.20 | 1.8 MB | ###9 | 40% sqlalchemy-1.3.20 | 1.8 MB | ####6 | 47% sqlalchemy-1.3.20 | 1.8 MB | #####1 | 52% sqlalchemy-1.3.20 | 1.8 MB | #####8 | 59% sqlalchemy-1.3.20 | 1.8 MB | ######6 | 66% sqlalchemy-1.3.20 | 1.8 MB | #######1 | 71% sqlalchemy-1.3.20 | 1.8 MB | ######## | 81% sqlalchemy-1.3.20 | 1.8 MB | ########8 | 88% sqlalchemy-1.3.20 | 1.8 MB | #########3 | 93% sqlalchemy-1.3.20 | 1.8 MB | #########7 | 98% sqlalchemy-1.3.20 | 1.8 MB | ########## | 100% Preparing transaction: ...working... done Verifying transaction: ...working... done Executing transaction: ...working... done
Connecting the Julyter notebook to the database file: database.sqlite
%%capture
%load_ext sql
%sql sqlite:///database.sqlite
%%sql
SELECT *
FROM sqlite_master
WHERE type='table';
* sqlite:///database.sqlite Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 4 | CREATE TABLE sqlite_sequence(name,seq) |
table | Player_Attributes | Player_Attributes | 11 | CREATE TABLE "Player_Attributes" ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `player_fifa_api_id` INTEGER, `player_api_id` INTEGER, `date` TEXT, `overall_rating` INTEGER, `potential` INTEGER, `preferred_foot` TEXT, `attacking_work_rate` TEXT, `defensive_work_rate` TEXT, `crossing` INTEGER, `finishing` INTEGER, `heading_accuracy` INTEGER, `short_passing` INTEGER, `volleys` INTEGER, `dribbling` INTEGER, `curve` INTEGER, `free_kick_accuracy` INTEGER, `long_passing` INTEGER, `ball_control` INTEGER, `acceleration` INTEGER, `sprint_speed` INTEGER, `agility` INTEGER, `reactions` INTEGER, `balance` INTEGER, `shot_power` INTEGER, `jumping` INTEGER, `stamina` INTEGER, `strength` INTEGER, `long_shots` INTEGER, `aggression` INTEGER, `interceptions` INTEGER, `positioning` INTEGER, `vision` INTEGER, `penalties` INTEGER, `marking` INTEGER, `standing_tackle` INTEGER, `sliding_tackle` INTEGER, `gk_diving` INTEGER, `gk_handling` INTEGER, `gk_kicking` INTEGER, `gk_positioning` INTEGER, `gk_reflexes` INTEGER, FOREIGN KEY(`player_fifa_api_id`) REFERENCES `Player`(`player_fifa_api_id`), FOREIGN KEY(`player_api_id`) REFERENCES `Player`(`player_api_id`) ) |
table | Player | Player | 14 | CREATE TABLE `Player` ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `player_api_id` INTEGER UNIQUE, `player_name` TEXT, `player_fifa_api_id` INTEGER UNIQUE, `birthday` TEXT, `height` INTEGER, `weight` INTEGER ) |
table | Match | Match | 18 | CREATE TABLE `Match` ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `country_id` INTEGER, `league_id` INTEGER, `season` TEXT, `stage` INTEGER, `date` TEXT, `match_api_id` INTEGER UNIQUE, `home_team_api_id` INTEGER, `away_team_api_id` INTEGER, `home_team_goal` INTEGER, `away_team_goal` INTEGER, `home_player_X1` INTEGER, `home_player_X2` INTEGER, `home_player_X3` INTEGER, `home_player_X4` INTEGER, `home_player_X5` INTEGER, `home_player_X6` INTEGER, `home_player_X7` INTEGER, `home_player_X8` INTEGER, `home_player_X9` INTEGER, `home_player_X10` INTEGER, `home_player_X11` INTEGER, `away_player_X1` INTEGER, `away_player_X2` INTEGER, `away_player_X3` INTEGER, `away_player_X4` INTEGER, `away_player_X5` INTEGER, `away_player_X6` INTEGER, `away_player_X7` INTEGER, `away_player_X8` INTEGER, `away_player_X9` INTEGER, `away_player_X10` INTEGER, `away_player_X11` INTEGER, `home_player_Y1` INTEGER, `home_player_Y2` INTEGER, `home_player_Y3` INTEGER, `home_player_Y4` INTEGER, `home_player_Y5` INTEGER, `home_player_Y6` INTEGER, `home_player_Y7` INTEGER, `home_player_Y8` INTEGER, `home_player_Y9` INTEGER, `home_player_Y10` INTEGER, `home_player_Y11` INTEGER, `away_player_Y1` INTEGER, `away_player_Y2` INTEGER, `away_player_Y3` INTEGER, `away_player_Y4` INTEGER, `away_player_Y5` INTEGER, `away_player_Y6` INTEGER, `away_player_Y7` INTEGER, `away_player_Y8` INTEGER, `away_player_Y9` INTEGER, `away_player_Y10` INTEGER, `away_player_Y11` INTEGER, `home_player_1` INTEGER, `home_player_2` INTEGER, `home_player_3` INTEGER, `home_player_4` INTEGER, `home_player_5` INTEGER, `home_player_6` INTEGER, `home_player_7` INTEGER, `home_player_8` INTEGER, `home_player_9` INTEGER, `home_player_10` INTEGER, `home_player_11` INTEGER, `away_player_1` INTEGER, `away_player_2` INTEGER, `away_player_3` INTEGER, `away_player_4` INTEGER, `away_player_5` INTEGER, `away_player_6` INTEGER, `away_player_7` INTEGER, `away_player_8` INTEGER, `away_player_9` INTEGER, `away_player_10` INTEGER, `away_player_11` INTEGER, `goal` TEXT, `shoton` TEXT, `shotoff` TEXT, `foulcommit` TEXT, `card` TEXT, `cross` TEXT, `corner` TEXT, `possession` TEXT, `B365H` NUMERIC, `B365D` NUMERIC, `B365A` NUMERIC, `BWH` NUMERIC, `BWD` NUMERIC, `BWA` NUMERIC, `IWH` NUMERIC, `IWD` NUMERIC, `IWA` NUMERIC, `LBH` NUMERIC, `LBD` NUMERIC, `LBA` NUMERIC, `PSH` NUMERIC, `PSD` NUMERIC, `PSA` NUMERIC, `WHH` NUMERIC, `WHD` NUMERIC, `WHA` NUMERIC, `SJH` NUMERIC, `SJD` NUMERIC, `SJA` NUMERIC, `VCH` NUMERIC, `VCD` NUMERIC, `VCA` NUMERIC, `GBH` NUMERIC, `GBD` NUMERIC, `GBA` NUMERIC, `BSH` NUMERIC, `BSD` NUMERIC, `BSA` NUMERIC, FOREIGN KEY(`country_id`) REFERENCES `country`(`id`), FOREIGN KEY(`league_id`) REFERENCES `League`(`id`), FOREIGN KEY(`home_team_api_id`) REFERENCES `Team`(`team_api_id`), FOREIGN KEY(`away_team_api_id`) REFERENCES `Team`(`team_api_id`), FOREIGN KEY(`home_player_1`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_2`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_3`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_4`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_5`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_6`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_7`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_8`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_9`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_10`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`home_player_11`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_1`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_2`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_3`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_4`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_5`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_6`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_7`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_8`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_9`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_10`) REFERENCES `Player`(`player_api_id`), FOREIGN KEY(`away_player_11`) REFERENCES `Player`(`player_api_id`) ) |
table | League | League | 24 | CREATE TABLE `League` ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `country_id` INTEGER, `name` TEXT UNIQUE, FOREIGN KEY(`country_id`) REFERENCES `country`(`id`) ) |
table | Country | Country | 26 | CREATE TABLE `Country` ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `name` TEXT UNIQUE ) |
table | Team | Team | 29 | CREATE TABLE "Team" ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `team_api_id` INTEGER UNIQUE, `team_fifa_api_id` INTEGER, `team_long_name` TEXT, `team_short_name` TEXT ) |
table | Team_Attributes | Team_Attributes | 2 | CREATE TABLE `Team_Attributes` ( `id` INTEGER PRIMARY KEY AUTOINCREMENT, `team_fifa_api_id` INTEGER, `team_api_id` INTEGER, `date` TEXT, `buildUpPlaySpeed` INTEGER, `buildUpPlaySpeedClass` TEXT, `buildUpPlayDribbling` INTEGER, `buildUpPlayDribblingClass` TEXT, `buildUpPlayPassing` INTEGER, `buildUpPlayPassingClass` TEXT, `buildUpPlayPositioningClass` TEXT, `chanceCreationPassing` INTEGER, `chanceCreationPassingClass` TEXT, `chanceCreationCrossing` INTEGER, `chanceCreationCrossingClass` TEXT, `chanceCreationShooting` INTEGER, `chanceCreationShootingClass` TEXT, `chanceCreationPositioningClass` TEXT, `defencePressure` INTEGER, `defencePressureClass` TEXT, `defenceAggression` INTEGER, `defenceAggressionClass` TEXT, `defenceTeamWidth` INTEGER, `defenceTeamWidthClass` TEXT, `defenceDefenderLineClass` TEXT, FOREIGN KEY(`team_fifa_api_id`) REFERENCES `Team`(`team_fifa_api_id`), FOREIGN KEY(`team_api_id`) REFERENCES `Team`(`team_api_id`) ) |
Extracting name of the tables
%%sql
SELECT name, type
FROM sqlite_master
WHERE type IN ("table", "view")
* sqlite:///database.sqlite Done.
name | type |
---|---|
sqlite_sequence | table |
Player_Attributes | table |
Player | table |
Match | table |
League | table |
Country | table |
Team | table |
Team_Attributes | table |
best_leagues | view |
Name of the leagues
%%sql
SELECT * FROM league;
* sqlite:///database.sqlite Done.
id | country_id | name |
---|---|---|
1 | 1 | Belgium Jupiler League |
1729 | 1729 | England Premier League |
4769 | 4769 | France Ligue 1 |
7809 | 7809 | Germany 1. Bundesliga |
10257 | 10257 | Italy Serie A |
13274 | 13274 | Netherlands Eredivisie |
15722 | 15722 | Poland Ekstraklasa |
17642 | 17642 | Portugal Liga ZON Sagres |
19694 | 19694 | Scotland Premier League |
21518 | 21518 | Spain LIGA BBVA |
24558 | 24558 | Switzerland Super League |
Note: As I mentioned above, I will focus on the 5 best known leagues:
* England Premier League
* France Ligue 1
* Italy Serie A
* Germany 1. Bundesliga
* Spain LIGA BBVA
I will create a view with the data of only these views
Creating a VIEW with the best leagues
%%sql
DROP VIEW IF EXISTS best_leagues;
CREATE VIEW best_leagues AS
SELECT id, country_id,
CASE
WHEN name = 'England Premier League' THEN 'Premier League'
WHEN name = 'France Ligue 1' THEN 'League 1'
WHEN name = 'Germany 1. Bundesliga' THEN 'Bundesliga'
WHEN name = 'Italy Serie A' THEN 'Serie A'
WHEN name = 'Spain LIGA BBVA' THEN 'La Liga'
END AS name
FROM league
WHERE name IN ('England Premier League', 'France Ligue 1', 'Germany 1. Bundesliga', 'Italy Serie A','Spain LIGA BBVA');
SELECT * FROM best_leagues;
* sqlite:///database.sqlite Done. Done. Done.
id | country_id | name |
---|---|---|
1729 | 1729 | Premier League |
4769 | 4769 | League 1 |
7809 | 7809 | Bundesliga |
10257 | 10257 | Serie A |
21518 | 21518 | La Liga |
Creating a VIEW with the teams of the biggest leagues
%%sql
SELECT t.* FROM team t LIMIT 1;
* sqlite:///database.sqlite Done.
id | team_api_id | team_fifa_api_id | team_long_name | team_short_name |
---|---|---|---|---|
1 | 9987 | 673 | KRC Genk | GEN |
%%sql
DROP VIEW IF EXISTS best_teams;
CREATE VIEW best_teams AS
SELECT t.* FROM team t
INNER JOIN Match m ON m.home_team_api_id = t.team_api_id
INNER JOIN best_leagues l ON l.id = m.league_id
GROUP BY t.team_long_name;
SELECT COUNT(*) FROM best_teams
* sqlite:///database.sqlite Done. Done. Done.
COUNT(*) |
---|
164 |
Creating a VIEW for the players of the best teams
I will be analyzing the players that played in the teams that belonged to the best leagues (the main 5 leagues)
%%sql
SELECT * FROM player LIMIT 0;
* sqlite:///database.sqlite Done.
id | player_api_id | player_name | player_fifa_api_id | birthday | height | weight |
---|
%%sql
SELECT * FROM match
WHERE season = '2011/2012' LIMIT 5;
* sqlite:///database.sqlite Done.
id | country_id | league_id | season | stage | date | match_api_id | home_team_api_id | away_team_api_id | home_team_goal | away_team_goal | home_player_X1 | home_player_X2 | home_player_X3 | home_player_X4 | home_player_X5 | home_player_X6 | home_player_X7 | home_player_X8 | home_player_X9 | home_player_X10 | home_player_X11 | away_player_X1 | away_player_X2 | away_player_X3 | away_player_X4 | away_player_X5 | away_player_X6 | away_player_X7 | away_player_X8 | away_player_X9 | away_player_X10 | away_player_X11 | home_player_Y1 | home_player_Y2 | home_player_Y3 | home_player_Y4 | home_player_Y5 | home_player_Y6 | home_player_Y7 | home_player_Y8 | home_player_Y9 | home_player_Y10 | home_player_Y11 | away_player_Y1 | away_player_Y2 | away_player_Y3 | away_player_Y4 | away_player_Y5 | away_player_Y6 | away_player_Y7 | away_player_Y8 | away_player_Y9 | away_player_Y10 | away_player_Y11 | home_player_1 | home_player_2 | home_player_3 | home_player_4 | home_player_5 | home_player_6 | home_player_7 | home_player_8 | home_player_9 | home_player_10 | home_player_11 | away_player_1 | away_player_2 | away_player_3 | away_player_4 | away_player_5 | away_player_6 | away_player_7 | away_player_8 | away_player_9 | away_player_10 | away_player_11 | goal | shoton | shotoff | foulcommit | card | cross | corner | possession | B365H | B365D | B365A | BWH | BWD | BWA | IWH | IWD | IWA | LBH | LBD | LBA | PSH | PSD | PSA | WHH | WHD | WHA | SJH | SJD | SJA | VCH | VCD | VCA | GBH | GBD | GBA | BSH | BSD | BSA |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
757 | 1 | 1 | 2011/2012 | 1 | 2011-07-29 00:00:00 | 1032692 | 1773 | 8635 | 2 | 1 | 1 | 2 | 4 | 6 | 8 | 3 | 5 | 7 | 3 | 5 | 7 | 1 | 3 | 4 | 7 | 1 | 3 | 5 | 7 | 9 | 4 | 6 | 1 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 10 | 10 | 10 | 1 | 3 | 3 | 3 | 7 | 7 | 7 | 7 | 7 | 10 | 10 | 37993 | 37865 | 37051 | 45840 | 179059 | 37981 | 38791 | 37963 | 38777 | 45865 | 68114 | 38391 | 38389 | 208493 | 149150 | 40536 | 38253 | 114333 | 178249 | 265123 | 46552 | 181276 | None | None | None | None | None | None | None | None | 7 | 4 | 1.5 | 6.5 | 4.1 | 1.42 | 5 | 3.7 | 1.5 | 5.5 | 3.6 | 1.5 | None | None | None | 5.5 | 3.75 | 1.57 | 6 | 3.6 | 1.57 | 7 | 4 | 1.5 | 6 | 3.75 | 1.5 | 6.5 | 4 | 1.44 |
758 | 1 | 1 | 2011/2012 | 1 | 2011-07-30 00:00:00 | 1032693 | 9998 | 9985 | 1 | 1 | 1 | 1 | 3 | 5 | 7 | 9 | 3 | 4 | 8 | 6 | 5 | 1 | 2 | 4 | 6 | 8 | 6 | 5 | 4 | 3 | 5 | 7 | 1 | 3 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 7 | 11 | 1 | 3 | 3 | 3 | 3 | 6 | 8 | 6 | 10 | 10 | 10 | 39153 | 39977 | 181574 | 166577 | 38906 | 36849 | 21753 | 37128 | 149258 | 26224 | 78902 | 38797 | 129462 | 245653 | 164229 | 33620 | 38969 | 17276 | 119117 | 38382 | 248689 | 46335 | None | None | None | None | None | None | None | None | 5 | 3.5 | 1.73 | 4.75 | 3.5 | 1.65 | 3.8 | 3.3 | 1.75 | 4.5 | 3.5 | 1.62 | None | None | None | 4.33 | 3.5 | 1.75 | 4.5 | 3.5 | 1.73 | 4.75 | 3.6 | 1.75 | 4.5 | 3.4 | 1.72 | 4.5 | 3.6 | 1.67 |
759 | 1 | 1 | 2011/2012 | 1 | 2011-07-30 00:00:00 | 1032694 | 9987 | 9993 | 3 | 1 | 1 | 2 | 4 | 6 | 8 | 2 | 4 | 6 | 8 | 4 | 6 | 1 | 2 | 4 | 6 | 8 | 3 | 5 | 7 | 3 | 5 | 7 | 1 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 7 | 10 | 10 | 1 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 10 | 10 | 10 | 91929 | 94462 | 38368 | 148314 | 109331 | 104411 | 39498 | 169200 | 43158 | 42153 | 38794 | 36873 | 57078 | 38800 | 174363 | 27508 | 38784 | 163613 | 38371 | 33622 | 166679 | 14487 | None | None | None | None | None | None | None | None | 1.44 | 4.33 | 7 | 1.45 | 3.95 | 6.25 | 1.5 | 3.7 | 5 | 1.44 | 3.75 | 6 | None | None | None | 1.4 | 4.33 | 7 | 1.44 | 4.2 | 6.5 | 1.44 | 4.5 | 7 | 1.45 | 4 | 6.25 | 1.44 | 4 | 6.5 |
760 | 1 | 1 | 2011/2012 | 1 | 2011-07-30 00:00:00 | 1032695 | 9991 | 9984 | 0 | 1 | 1 | 1 | 3 | 5 | 7 | 9 | 3 | 5 | 7 | 4 | 6 | 1 | 2 | 4 | 6 | 8 | 3 | 5 | 7 | 3 | 5 | 7 | 1 | 3 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 10 | 10 | 1 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 10 | 10 | 10 | 37854 | 12473 | 114368 | 178096 | 37440 | None | 33662 | 26771 | 166618 | 148329 | 12574 | 36835 | 38342 | 243250 | 37047 | 38789 | 27110 | 166670 | 188231 | 277766 | 38251 | 209855 | None | None | None | None | None | None | None | None | 1.57 | 3.8 | 6 | 1.55 | 3.85 | 5 | 1.55 | 3.5 | 4.8 | 1.57 | 3.5 | 5 | None | None | None | 1.6 | 3.75 | 5 | 1.5 | 3.9 | 6 | 1.53 | 4 | 6 | 1.55 | 3.75 | 5.5 | 1.53 | 3.75 | 5.5 |
761 | 1 | 1 | 2011/2012 | 1 | 2011-07-30 00:00:00 | 1032696 | 9994 | 10000 | 0 | 0 | 1 | 3 | 4 | 7 | 1 | 3 | 5 | 7 | 9 | 4 | 6 | 1 | 1 | 3 | 5 | 7 | 9 | 3 | 5 | 7 | 4 | 6 | 1 | 3 | 3 | 3 | 7 | 7 | 7 | 7 | 7 | 10 | 10 | 1 | 3 | 3 | 3 | 3 | 3 | 7 | 7 | 7 | 10 | 10 | 30934 | 94030 | 25791 | 166675 | 95609 | 38290 | 67898 | 30910 | 42706 | 104406 | 104404 | 37900 | 37100 | 41005 | 46877 | 80678 | 37886 | 131530 | 208984 | None | 208852 | 240044 | None | None | None | None | None | None | None | None | 2.2 | 3.3 | 3.3 | 2.15 | 3.25 | 3.05 | 2.1 | 3.1 | 3 | 2 | 3.2 | 3.2 | None | None | None | 2 | 3.3 | 3.4 | 2.2 | 3.2 | 3.2 | 2.2 | 3.3 | 3.3 | 2.15 | 3.25 | 3.1 | 2.2 | 3.3 | 2.88 |
%%sql
DROP VIEW IF EXISTS best_players;
CREATE VIEW best_players AS
WITH pl AS
(
SELECT home_player_1, home_player_2, home_player_3, home_player_4, home_player_5, home_player_6,
home_player_7, home_player_8, home_player_9, home_player_10, home_player_11, away_player_1,
away_player_2, away_player_3, away_player_4, away_player_5, away_player_6, away_player_7,
away_player_8, away_player_9, away_player_10, away_player_11
FROM Match
INNER JOIN best_leagues ON best_leagues.id = Match.league_id
)
SELECT id, player_api_id, player_name, player_fifa_api_id, date(birthday), substr(birthday, 1,4) AS birth_year,
substr(birthday, 6,2) AS birth_month, substr(birthday, 9,2) AS birth_day, height, weight
FROM player WHERE player_api_id IN (SELECT home_player_1 FROM pl)
OR player_api_id IN (SELECT home_player_2 FROM pl)
OR player_api_id IN (SELECT home_player_3 FROM pl)
OR player_api_id IN (SELECT home_player_4 FROM pl)
OR player_api_id IN (SELECT home_player_5 FROM pl)
OR player_api_id IN (SELECT home_player_6 FROM pl)
OR player_api_id IN (SELECT home_player_7 FROM pl)
OR player_api_id IN (SELECT home_player_8 FROM pl)
OR player_api_id IN (SELECT home_player_9 FROM pl)
OR player_api_id IN (SELECT home_player_10 FROM pl)
OR player_api_id IN (SELECT home_player_11 FROM pl)
OR player_api_id IN (SELECT away_player_1 FROM pl)
OR player_api_id IN (SELECT away_player_2 FROM pl)
OR player_api_id IN (SELECT away_player_3 FROM pl)
OR player_api_id IN (SELECT away_player_4 FROM pl)
OR player_api_id IN (SELECT away_player_5 FROM pl)
OR player_api_id IN (SELECT away_player_6 FROM pl)
OR player_api_id IN (SELECT away_player_7 FROM pl)
OR player_api_id IN (SELECT away_player_8 FROM pl)
OR player_api_id IN (SELECT away_player_9 FROM pl)
OR player_api_id IN (SELECT away_player_10 FROM pl)
OR player_api_id IN (SELECT away_player_11 FROM pl)
;
SELECT COUNT(*) FROM best_players;
* sqlite:///database.sqlite Done. Done. Done.
COUNT(*) |
---|
6102 |
Best_players
%%sql
SELECT * FROM best_players LIMIT 5;
* sqlite:///database.sqlite Done.
id | player_api_id | player_name | player_fifa_api_id | date(birthday) | birth_year | birth_month | birth_day | height | weight |
---|---|---|---|---|---|---|---|---|---|
9677 | 2984 | Sergio Aragones | 111106 | 1977-02-01 | 1977 | 02 | 01 | 182.88 | 176 |
6856 | 5440 | Marjan Petkovic | 122118 | 1979-05-22 | 1979 | 05 | 22 | 185.42 | 185 |
4784 | 11316 | Jean-Louis Leca | 153275 | 1985-09-21 | 1985 | 09 | 21 | 180.34 | 165 |
1232 | 11319 | Benoit Costil | 158121 | 1987-07-03 | 1987 | 07 | 03 | 187.96 | 190 |
9895 | 11320 | Steeve Elana | 111163 | 1980-07-11 | 1980 | 07 | 11 | 187.96 | 187 |
Best_teams
%%sql
SELECT * FROM best_teams LIMIT 5;
* sqlite:///database.sqlite Done.
id | team_api_id | team_fifa_api_id | team_long_name | team_short_name |
---|---|---|---|---|
3457 | 10260 | 11 | Manchester United | MUN |
3459 | 9825 | 1 | Arsenal | ARS |
3461 | 8472 | 106 | Sunderland | SUN |
3463 | 8654 | 19 | West Ham United | WHU |
3465 | 10252 | 2 | Aston Villa | AVL |
Best_leagues
%%sql
SELECT * FROM best_leagues;
* sqlite:///database.sqlite Done.
id | country_id | name |
---|---|---|
1729 | 1729 | Premier League |
4769 | 4769 | League 1 |
7809 | 7809 | Bundesliga |
10257 | 10257 | Serie A |
21518 | 21518 | La Liga |
As a result of my wrangling analysis, 3 different views were created to diminish the amount of data to the one of my interest for my analysis. New columns such as birth_year, birth_month and birth_day were added to facilitate future calculations and analysis. Name of the leagues were modified to their popular name (how must of people call them).
Major problems such as null data were not identify. From the source I had the information of no null data in this dataset. Structure problems or grammar issues were not visually identified. As the dataset is very long, possible grammar issues in the data may be encountered during the future analysis and will be addressed them.
Exploring the attributes table
%%sql
SELECT * FROM Team_attributes LIMIT 10;
* sqlite:///database.sqlite Done.
id | team_fifa_api_id | team_api_id | date | buildUpPlaySpeed | buildUpPlaySpeedClass | buildUpPlayDribbling | buildUpPlayDribblingClass | buildUpPlayPassing | buildUpPlayPassingClass | buildUpPlayPositioningClass | chanceCreationPassing | chanceCreationPassingClass | chanceCreationCrossing | chanceCreationCrossingClass | chanceCreationShooting | chanceCreationShootingClass | chanceCreationPositioningClass | defencePressure | defencePressureClass | defenceAggression | defenceAggressionClass | defenceTeamWidth | defenceTeamWidthClass | defenceDefenderLineClass |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 434 | 9930 | 2010-02-22 00:00:00 | 60 | Balanced | None | Little | 50 | Mixed | Organised | 60 | Normal | 65 | Normal | 55 | Normal | Organised | 50 | Medium | 55 | Press | 45 | Normal | Cover |
2 | 434 | 9930 | 2014-09-19 00:00:00 | 52 | Balanced | 48 | Normal | 56 | Mixed | Organised | 54 | Normal | 63 | Normal | 64 | Normal | Organised | 47 | Medium | 44 | Press | 54 | Normal | Cover |
3 | 434 | 9930 | 2015-09-10 00:00:00 | 47 | Balanced | 41 | Normal | 54 | Mixed | Organised | 54 | Normal | 63 | Normal | 64 | Normal | Organised | 47 | Medium | 44 | Press | 54 | Normal | Cover |
4 | 77 | 8485 | 2010-02-22 00:00:00 | 70 | Fast | None | Little | 70 | Long | Organised | 70 | Risky | 70 | Lots | 70 | Lots | Organised | 60 | Medium | 70 | Double | 70 | Wide | Cover |
5 | 77 | 8485 | 2011-02-22 00:00:00 | 47 | Balanced | None | Little | 52 | Mixed | Organised | 53 | Normal | 48 | Normal | 52 | Normal | Organised | 47 | Medium | 47 | Press | 52 | Normal | Cover |
6 | 77 | 8485 | 2012-02-22 00:00:00 | 58 | Balanced | None | Little | 62 | Mixed | Organised | 45 | Normal | 70 | Lots | 55 | Normal | Organised | 40 | Medium | 40 | Press | 60 | Normal | Cover |
7 | 77 | 8485 | 2013-09-20 00:00:00 | 62 | Balanced | None | Little | 45 | Mixed | Organised | 40 | Normal | 50 | Normal | 55 | Normal | Organised | 42 | Medium | 42 | Press | 60 | Normal | Cover |
8 | 77 | 8485 | 2014-09-19 00:00:00 | 58 | Balanced | 64 | Normal | 62 | Mixed | Organised | 56 | Normal | 68 | Lots | 57 | Normal | Organised | 41 | Medium | 42 | Press | 60 | Normal | Cover |
9 | 77 | 8485 | 2015-09-10 00:00:00 | 59 | Balanced | 64 | Normal | 53 | Mixed | Organised | 51 | Normal | 72 | Lots | 63 | Normal | Free Form | 49 | Medium | 45 | Press | 63 | Normal | Cover |
10 | 614 | 8576 | 2010-02-22 00:00:00 | 60 | Balanced | None | Little | 40 | Mixed | Organised | 45 | Normal | 35 | Normal | 55 | Normal | Organised | 30 | Deep | 70 | Double | 30 | Narrow | Offside Trap |
Let's understand better our attributes and classes
I will extract the range of the classes of our attributes
%%sql
SELECT buildUpPlaySpeedClass, MIN(buildUpPlaySpeed) AS Min_Points, MAX(buildUpPlaySpeed) AS Max_Points
FROM team_attributes
GROUP BY buildUpPlaySpeedClass;
* sqlite:///database.sqlite Done.
buildUpPlaySpeedClass | Min_Points | Max_Points |
---|---|---|
Balanced | 34 | 66 |
Fast | 67 | 80 |
Slow | 20 | 33 |
%%sql
SELECT buildUpPlayDribblingClass, MIN(buildUpPlayDribbling) AS Min_Points, MAX(buildUpPlayDribbling) AS Max_Points
FROM team_attributes
GROUP BY buildUpPlayDribblingClass;
* sqlite:///database.sqlite Done.
buildUpPlayDribblingClass | Min_Points | Max_Points |
---|---|---|
Little | 24 | 33 |
Lots | 67 | 77 |
Normal | 34 | 66 |
%%sql
SELECT buildUpPlayPassingClass, MIN(buildUpPlayPassing) AS Min_Points, MAX(buildUpPlayPassing) AS Max_Points
FROM team_attributes
GROUP BY buildUpPlayPassingClass;
* sqlite:///database.sqlite Done.
buildUpPlayPassingClass | Min_Points | Max_Points |
---|---|---|
Long | 67 | 80 |
Mixed | 34 | 66 |
Short | 20 | 33 |
%%sql
SELECT chanceCreationPassingClass, MIN(chanceCreationPassing) AS Min_Points, MAX(chanceCreationPassing) AS Max_Points
FROM team_attributes
GROUP BY chanceCreationPassingClass;
* sqlite:///database.sqlite Done.
chanceCreationPassingClass | Min_Points | Max_Points |
---|---|---|
Normal | 34 | 66 |
Risky | 67 | 80 |
Safe | 21 | 33 |
%%sql
SELECT chanceCreationCrossingClass, MIN(chanceCreationCrossing) AS Min_Points, MAX(chanceCreationCrossing) AS Max_Points
FROM team_attributes
GROUP BY chanceCreationCrossingClass;
* sqlite:///database.sqlite Done.
chanceCreationCrossingClass | Min_Points | Max_Points |
---|---|---|
Little | 20 | 33 |
Lots | 67 | 80 |
Normal | 34 | 66 |
%%sql
SELECT chanceCreationShootingClass, MIN(chanceCreationShooting) AS Min_Points, MAX(chanceCreationShooting) AS Max_Points
FROM team_attributes
GROUP BY chanceCreationShootingClass;
* sqlite:///database.sqlite Done.
chanceCreationShootingClass | Min_Points | Max_Points |
---|---|---|
Little | 22 | 33 |
Lots | 67 | 80 |
Normal | 34 | 66 |
%%sql
SELECT defencePressureClass, MIN(defencePressure) AS Min_Points, MAX(defencePressure) AS Max_Points
FROM team_attributes
GROUP BY defencePressureClass;
* sqlite:///database.sqlite Done.
defencePressureClass | Min_Points | Max_Points |
---|---|---|
Deep | 23 | 33 |
High | 67 | 72 |
Medium | 34 | 66 |
%%sql
SELECT defenceAggressionClass, MIN(defenceAggression) AS Min_Points, MAX(defenceAggression) AS Max_Points
FROM team_attributes
GROUP BY defenceAggressionClass;
* sqlite:///database.sqlite Done.
defenceAggressionClass | Min_Points | Max_Points |
---|---|---|
Contain | 24 | 33 |
Double | 67 | 72 |
Press | 34 | 66 |
%%sql
SELECT defenceTeamWidthClass, MIN(defenceTeamWidth) AS Min_Points, MAX(defenceTeamWidth) AS Max_Points
FROM team_attributes
GROUP BY defenceTeamWidthClass;
* sqlite:///database.sqlite Done.
defenceTeamWidthClass | Min_Points | Max_Points |
---|---|---|
Narrow | 29 | 33 |
Normal | 34 | 66 |
Wide | 67 | 73 |
Before continuing, I will define the summarized attributes according to FIFA in order to understand how they define and are related to the victory or defeat of a team and its style of play.
Important Note: It is fair to say that in football better statistical results not always lead to more victories. Some attributes have more weight in this factor than others. Outside factors such as luck, mistake of players and referees during the game are not taken into account inside this data. So, per safe of this analysis, I will trust the number provided by the attributes in order to define the best teams, at least on these terms.
buildUpPlaySpeed:
Define the speed in which attacks are put together:
Values | Category |
---|---|
1 - 33 | SLOW |
34 - 66 | BALANCED |
67 - 100 | FAST |
buildUpPassing:
Affects passing distance and support from teammates:
Values | Category |
---|---|
1 - 33 | SHORT |
34 - 66 | MIXED |
67 - 100 | LONG |
buildUpDribbling:
This parameter defines the creativity of the player in 1 on 1 situations.
Values | Category |
---|---|
1 - 33 | LITTLE |
34 - 66 | NORMAL |
67 - 100 | LOTS |
ChanceCreationPassingClass:
Amount of risk in pass decision and run support:
Values | Category |
---|---|
1 - 33 | SAFE |
34 - 66 | NORMAL |
67 - 100 | RISKY |
ChanceCreationCrossingClass:
The tendency / frequency of crosses into the box
Values | Category |
---|---|
1 - 33 | LITTLE |
34 - 66 | NORMAL |
67 - 100 | LOTS |
ChanceCreationShootingClass:
The tendency / frequency of shots taken:
Values | Category |
---|---|
1 - 33 | LITTLE |
34 - 66 | NORMAL |
67 - 100 | LOTS |
DefencePressureClass
Defines how hight he pitch the team will start pressuring:
Values | Category |
---|---|
1 - 33 | DEEP |
34 - 66 | MEDIUM |
67 - 100 | HIGH |
DefenceAggressionClass:
Defines the team approach to tackling the ball possessor:
Values | Category |
---|---|
1 - 33 | CONTAIN |
34 - 66 | PRESS |
67 - 100 | DOUBLE |
DefenceTeamWidthClass
Defines how much the team shift to the ball side. The narrower width means that the team tends to cover central position while the wider teams tend to cover more the wings/sides.
Values | Category |
---|---|
1 - 33 | NARROW |
34 - 66 | NORMAL |
67 - 100 | WIDE |
%%sql
SELECT l.name AS League, m.Season AS Season,
ROUND(AVG(a.buildUpPlaySpeed),2) AS Avg_PlaySpeed, MAX(a.buildUpPlaySpeed) AS Max_PlaySpeed,
MIN(a.buildUpPlaySpeed) AS Min_PlaySpeed,
RANK()
OVER (PARTITION BY Season ORDER BY ROUND(AVG(a.buildUpPlaySpeed),2) DESC) AS Ranking
FROM best_leagues l
INNER JOIN Match m ON m.league_id = l.id
INNER JOIN Team t ON t.team_api_id = m.home_team_api_id
INNER JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY League, Season;
* sqlite:///database.sqlite Done.
League | Season | Avg_PlaySpeed | Max_PlaySpeed | Min_PlaySpeed | Ranking |
---|---|---|---|---|---|
Premier League | 2008/2009 | 56.77 | 77 | 25 | 1 |
Bundesliga | 2008/2009 | 56.51 | 78 | 31 | 2 |
Serie A | 2008/2009 | 56.01 | 78 | 26 | 3 |
League 1 | 2008/2009 | 53.57 | 70 | 30 | 4 |
La Liga | 2008/2009 | 47.7 | 71 | 20 | 5 |
Bundesliga | 2009/2010 | 56.39 | 78 | 31 | 1 |
Premier League | 2009/2010 | 56.12 | 77 | 25 | 2 |
Serie A | 2009/2010 | 55.3 | 78 | 26 | 3 |
League 1 | 2009/2010 | 53.47 | 70 | 30 | 4 |
La Liga | 2009/2010 | 47.2 | 70 | 20 | 5 |
Bundesliga | 2010/2011 | 56.61 | 78 | 31 | 1 |
Premier League | 2010/2011 | 56.43 | 77 | 25 | 2 |
Serie A | 2010/2011 | 54.82 | 78 | 26 | 3 |
League 1 | 2010/2011 | 53.21 | 70 | 23 | 4 |
La Liga | 2010/2011 | 46.98 | 70 | 20 | 5 |
Bundesliga | 2011/2012 | 56.75 | 78 | 31 | 1 |
Serie A | 2011/2012 | 56.45 | 78 | 26 | 2 |
Premier League | 2011/2012 | 55.52 | 75 | 25 | 3 |
League 1 | 2011/2012 | 53.88 | 70 | 30 | 4 |
La Liga | 2011/2012 | 46.72 | 70 | 20 | 5 |
Bundesliga | 2012/2013 | 57.06 | 78 | 31 | 1 |
Serie A | 2012/2013 | 56.01 | 78 | 26 | 2 |
Premier League | 2012/2013 | 55.57 | 77 | 25 | 3 |
League 1 | 2012/2013 | 53.18 | 70 | 34 | 4 |
La Liga | 2012/2013 | 47.11 | 70 | 20 | 5 |
Bundesliga | 2013/2014 | 57.0 | 78 | 31 | 1 |
Premier League | 2013/2014 | 56.87 | 77 | 25 | 2 |
Serie A | 2013/2014 | 55.5 | 78 | 26 | 3 |
League 1 | 2013/2014 | 53.29 | 70 | 34 | 4 |
La Liga | 2013/2014 | 46.63 | 70 | 20 | 5 |
Bundesliga | 2014/2015 | 57.88 | 78 | 31 | 1 |
Premier League | 2014/2015 | 56.26 | 77 | 25 | 2 |
Serie A | 2014/2015 | 55.64 | 78 | 26 | 3 |
League 1 | 2014/2015 | 53.11 | 70 | 30 | 4 |
La Liga | 2014/2015 | 46.55 | 70 | 20 | 5 |
Bundesliga | 2015/2016 | 57.56 | 78 | 31 | 1 |
Premier League | 2015/2016 | 56.66 | 77 | 25 | 2 |
Serie A | 2015/2016 | 56.23 | 80 | 26 | 3 |
League 1 | 2015/2016 | 52.39 | 70 | 34 | 4 |
La Liga | 2015/2016 | 47.53 | 70 | 20 | 5 |
Top 5 teams with better play Passing score per league
Ranking the teams with better passing, in other words, those whose player's passes have higher efficacy. Good pass score leads to a player whose passes reaches his teammates
%%sql
WITH stats AS(
SELECT l.name AS League, t.team_long_name AS team_name, ROUND(AVG(a.buildUpPlayPassing),2) AS PlayPassing_Score,
RANK()
OVER (PARTITION BY l.name ORDER BY ROUND(AVG(a.buildUpPlayPassing),2) DESC ) AS Ranking
FROM best_leagues l
INNER JOIN Match m ON m.league_id = l.id
INNER JOIN Team t ON t.team_api_id = m.home_team_api_id
INNER JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY League, team_name
)
SELECT * FROM (SELECT * FROM stats WHERE League = 'Bundesliga' LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'League 1' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'La Liga' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Premier League' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Serie A' LIMIT 5 )
;
* sqlite:///database.sqlite Done.
League | team_name | PlayPassing_Score | Ranking |
---|---|---|---|
Bundesliga | SV Darmstadt 98 | 77.0 | 1 |
Bundesliga | 1. FC Köln | 61.17 | 2 |
Bundesliga | DSC Arminia Bielefeld | 59.33 | 3 |
Bundesliga | Eintracht Braunschweig | 56.0 | 4 |
Bundesliga | FC St. Pauli | 55.67 | 5 |
League 1 | OGC Nice | 61.83 | 1 |
League 1 | Grenoble Foot 38 | 60.0 | 2 |
League 1 | Toulouse FC | 55.33 | 3 |
League 1 | AJ Auxerre | 55.17 | 4 |
League 1 | Valenciennes FC | 54.0 | 5 |
La Liga | UD Almería | 60.5 | 1 |
La Liga | Levante UD | 56.67 | 2 |
La Liga | Sevilla FC | 56.0 | 3 |
La Liga | Elche CF | 55.33 | 4 |
La Liga | Real Valladolid | 53.67 | 5 |
Premier League | Stoke City | 67.0 | 1 |
Premier League | Birmingham City | 66.0 | 2 |
Premier League | Bolton Wanderers | 63.83 | 3 |
Premier League | Blackburn Rovers | 63.83 | 3 |
Premier League | Cardiff City | 62.17 | 5 |
Serie A | Chievo Verona | 57.67 | 1 |
Serie A | Inter | 57.17 | 2 |
Serie A | Livorno | 53.17 | 3 |
Serie A | Udinese | 53.0 | 4 |
Serie A | Sassuolo | 51.67 | 5 |
Teams with better builUpDribbling
Ranking the teams per each league with best one to one players score
%%sql
WITH stats AS(
SELECT l.name AS League, t.team_long_name AS team_name, ROUND(AVG(a.buildUpPlayDribbling),2) AS PlayDribbling_Score,
RANK()
OVER (PARTITION BY l.name ORDER BY ROUND(AVG(a.buildUpPlayDribbling),2) DESC ) AS Ranking
FROM best_leagues l
INNER JOIN Match m ON m.league_id = l.id
INNER JOIN Team t ON t.team_api_id = m.home_team_api_id
INNER JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY League, team_name
)
SELECT * FROM (SELECT * FROM stats WHERE League = 'Bundesliga' LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'League 1' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'La Liga' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Premier League' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM stats WHERE League = 'Serie A' LIMIT 5 )
;
* sqlite:///database.sqlite Done.
League | team_name | PlayDribbling_Score | Ranking |
---|---|---|---|
Bundesliga | Fortuna Düsseldorf | 65.5 | 1 |
Bundesliga | SpVgg Greuther Fürth | 60.0 | 2 |
Bundesliga | 1. FC Nürnberg | 60.0 | 2 |
Bundesliga | Hamburger SV | 58.5 | 4 |
Bundesliga | VfB Stuttgart | 58.0 | 5 |
League 1 | Olympique de Marseille | 69.5 | 1 |
League 1 | Stade Rennais FC | 62.0 | 2 |
League 1 | FC Sochaux-Montbéliard | 62.0 | 2 |
League 1 | FC Nantes | 61.5 | 4 |
League 1 | SC Bastia | 61.0 | 5 |
La Liga | RC Celta de Vigo | 56.5 | 1 |
La Liga | Granada CF | 56.0 | 2 |
La Liga | Real Madrid CF | 55.5 | 3 |
La Liga | UD Las Palmas | 55.0 | 4 |
La Liga | RC Deportivo de La Coruña | 55.0 | 4 |
Premier League | Liverpool | 54.0 | 1 |
Premier League | Arsenal | 51.0 | 2 |
Premier League | Chelsea | 46.5 | 3 |
Premier League | Blackburn Rovers | 45.5 | 4 |
Premier League | Tottenham Hotspur | 40.5 | 5 |
Serie A | Napoli | 72.0 | 1 |
Serie A | Milan | 70.0 | 2 |
Serie A | Torino | 69.0 | 3 |
Serie A | Parma | 68.5 | 4 |
Serie A | Sassuolo | 68.0 | 5 |
Creating a view with the teams who won and lossed the match
New columns:
who_loss
: team_api_id of the team who lost the game
who_win
: team_api_id of the tean who won the game `
%%sql
DROP VIEW IF EXISTS win_loss;
CREATE VIEW win_loss AS
SELECT season, home_team_api_id, away_team_api_id, league_id,
CASE WHEN home_team_goal > away_team_goal THEN home_team_api_id
WHEN away_team_goal > home_team_goal THEN away_team_api_id
ELSE 'draw'
END who_won,
CASE WHEN home_team_goal > away_team_goal THEN away_team_api_id
WHEN away_team_goal > home_team_goal THEN home_team_api_id
ELSE 'draw'
END who_lose
FROM Match;
* sqlite:///database.sqlite Done. Done.
[]
Creating a view with the ranking of teams according to their victories per season
Columns:
League, Team, Wins, Rank
%%sql
DROP VIEW IF EXISTS victories_ranking;
CREATE VIEW victories_ranking AS
SELECT w.season, t.team_api_id, l.name AS League, t.team_long_name AS Team, COUNT(w.who_won) AS Wins,
RANK()
OVER(PARTITION BY w.season ORDER BY COUNT(w.who_won) DESC) AS ranking
FROM win_loss w
INNER JOIN best_leagues l ON l.id = w.league_id
INNER JOIN best_teams t ON t.team_api_id = w.who_won
GROUP BY w.season, w.who_won;
* sqlite:///database.sqlite Done. Done.
[]
Creating a view with the ranking of teams according to their losses
Columns:
League, Team, Losses, Rank
%%sql
DROP VIEW IF EXISTS losses_ranking;
CREATE VIEW losses_ranking AS
SELECT t.team_api_id, l.name AS League, t.team_long_name AS Team, COUNT(w.who_lose) AS Losses,
RANK()
OVER(PARTITION BY w.season ORDER BY COUNT(w.who_lose) DESC) AS ranking
FROM win_loss w
JOIN best_leagues l ON l.id = w.league_id
JOIN best_teams t ON t.team_api_id = w.who_lose
GROUP BY w.season, w.who_lose;
* sqlite:///database.sqlite Done. Done.
[]
Image taken from Trollfootball
Season 2008/2009
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2008/2009' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2008/2009 | 10260 | Premier League | Manchester United | 28 | 1 |
2008/2009 | 8634 | La Liga | FC Barcelona | 27 | 2 |
2008/2009 | 8650 | Premier League | Liverpool | 25 | 3 |
2008/2009 | 8636 | Serie A | Inter | 25 | 3 |
2008/2009 | 8633 | La Liga | Real Madrid CF | 25 | 3 |
Season 2009/2010
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2009/2010' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2009/2010 | 8634 | La Liga | FC Barcelona | 31 | 1 |
2009/2010 | 8633 | La Liga | Real Madrid CF | 31 | 1 |
2009/2010 | 10260 | Premier League | Manchester United | 27 | 3 |
2009/2010 | 8455 | Premier League | Chelsea | 27 | 3 |
2009/2010 | 8686 | Serie A | Roma | 24 | 5 |
Season 2010/2011
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2010/2011' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2010/2011 | 8634 | La Liga | FC Barcelona | 30 | 1 |
2010/2011 | 8633 | La Liga | Real Madrid CF | 29 | 2 |
2010/2011 | 8564 | Serie A | Milan | 24 | 3 |
2010/2011 | 10260 | Premier League | Manchester United | 23 | 4 |
2010/2011 | 9789 | Bundesliga | Borussia Dortmund | 23 | 4 |
Season 2011/2012
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2011/2012' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2011/2012 | 8633 | La Liga | Real Madrid CF | 32 | 1 |
2011/2012 | 10260 | Premier League | Manchester United | 28 | 2 |
2011/2012 | 8634 | La Liga | FC Barcelona | 28 | 2 |
2011/2012 | 8456 | Premier League | Manchester City | 28 | 2 |
2011/2012 | 10249 | League 1 | Montpellier Hérault SC | 25 | 5 |
Season 2012/2013
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2012/2013' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2012/2013 | 8634 | La Liga | FC Barcelona | 32 | 1 |
2012/2013 | 9823 | Bundesliga | FC Bayern Munich | 29 | 2 |
2012/2013 | 10260 | Premier League | Manchester United | 28 | 3 |
2012/2013 | 9885 | Serie A | Juventus | 27 | 4 |
2012/2013 | 8633 | La Liga | Real Madrid CF | 26 | 5 |
Season 2013/2014
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2013/2014' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2013/2014 | 9885 | Serie A | Juventus | 33 | 1 |
2013/2014 | 9823 | Bundesliga | FC Bayern Munich | 29 | 2 |
2013/2014 | 9906 | La Liga | Atlético Madrid | 28 | 3 |
2013/2014 | 9847 | League 1 | Paris Saint-Germain | 27 | 4 |
2013/2014 | 8634 | La Liga | FC Barcelona | 27 | 4 |
Season 2014/2015
%%sql
SELECT * FROM (SELECT * FROM victories_ranking WHERE season = '2014/2015' ORDER BY Ranking LIMIT 5)
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2014/2015 | 8634 | La Liga | FC Barcelona | 30 | 1 |
2014/2015 | 8633 | La Liga | Real Madrid CF | 30 | 1 |
2014/2015 | 9885 | Serie A | Juventus | 26 | 3 |
2014/2015 | 8455 | Premier League | Chelsea | 26 | 3 |
2014/2015 | 9823 | Bundesliga | FC Bayern Munich | 25 | 5 |
In the analysis of the best teams per season, we can clearly notice there are two teams that most of times are at the top. This is the case of Real Madrid and Barcelona. In the last century, these teams have been the greatest rivals in the football, being their matches between them the most popular and most seen around the world, also called as 'The Classic'. They both belongs to 'La Liga' league. Also, Manchester United is the one of the most popular teams. All of them: Juventus, Bayern Munich, Barcelona, Real Madrid and the others have been the best teams in their leagues during these years.
Next I analyze some of the attributes of the teams with more losses during the seasons, so later I can compared them to the teams with more wins to study which playstyles and attributes had a higher influence in the achievemenbt of victory.
%%sql
WITH wins AS (
SELECT l.name AS League, t.team_long_name AS Team, COUNT(w.who_lose) AS Loss,
a.buildUpPlaySpeedClass AS speed_class, a.buildUpPlayDribblingClass AS dribbling_class,
a.buildUpPlayPassingClass AS passing_class, a.chanceCreationPassingClass AS creation_passing_class,
a.chanceCreationShootingCLass AS creation_shooting_class, a.defencePressureClass AS pressure_class,
a.defenceAggressionClass AS agression_class, a.defenceTeamWidthClass AS team_width_class,
a.defenceDefenderLineClass AS defender_line_class, a.buildUpPlayPositioningClass AS possition_class
FROM win_loss w
JOIN best_leagues l ON l.id = w.league_id
JOIN best_teams t ON t.team_api_id = w.who_lose
JOIN Team_Attributes a ON a.team_api_id = w.who_lose
GROUP BY w.who_lose
ORDER BY COUNT(w.who_lose) DESC
)
SELECT * FROM (SELECT * FROM wins WHERE League = 'Bundesliga' LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'League 1' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'La Liga' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'Premier League' LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'Serie A' LIMIT 5 )
* sqlite:///database.sqlite Done.
League | Team | Loss | speed_class | dribbling_class | passing_class | creation_passing_class | creation_shooting_class | pressure_class | agression_class | team_width_class | defender_line_class | possition_class |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Bundesliga | Hannover 96 | 744 | Balanced | Little | Long | Normal | Normal | High | Double | Wide | Cover | Organised |
Bundesliga | Hamburger SV | 678 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Bundesliga | VfB Stuttgart | 672 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Bundesliga | TSG 1899 Hoffenheim | 654 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Bundesliga | SV Werder Bremen | 642 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Wide | Cover | Organised |
League 1 | OGC Nice | 702 | Balanced | Little | Mixed | Normal | Normal | Deep | Press | Normal | Cover | Organised |
League 1 | FC Lorient | 696 | Balanced | Little | Long | Safe | Normal | High | Press | Wide | Offside Trap | Organised |
League 1 | Toulouse FC | 666 | Balanced | Little | Long | Normal | Normal | Deep | Press | Narrow | Cover | Organised |
League 1 | FC Sochaux-Montbéliard | 612 | Balanced | Little | Long | Normal | Normal | Medium | Contain | Normal | Cover | Organised |
League 1 | Stade Rennais FC | 612 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
La Liga | Getafe CF | 834 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
La Liga | RCD Espanyol | 804 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
La Liga | Málaga CF | 720 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Organised |
La Liga | Athletic Club de Bilbao | 666 | Balanced | Little | Mixed | Normal | Normal | Deep | Double | Normal | Cover | Organised |
La Liga | Levante UD | 618 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Premier League | Sunderland | 804 | Balanced | Little | Long | Risky | Normal | Medium | Double | Normal | Cover | Organised |
Premier League | Aston Villa | 780 | Balanced | Little | Long | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Premier League | Stoke City | 720 | Balanced | Little | Long | Risky | Normal | Medium | Double | Normal | Cover | Organised |
Premier League | Newcastle United | 714 | Balanced | Little | Long | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Premier League | West Bromwich Albion | 708 | Balanced | Little | Mixed | Normal | Normal | Deep | Contain | Normal | Cover | Organised |
Serie A | Chievo Verona | 774 | Balanced | Little | Long | Normal | Normal | Deep | Press | Normal | Cover | Organised |
Serie A | Genoa | 702 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Serie A | Bologna | 696 | Balanced | Little | Mixed | Normal | Normal | Deep | Press | Normal | Cover | Organised |
Serie A | Atalanta | 690 | Balanced | Little | Mixed | Normal | Normal | Deep | Contain | Normal | Offside Trap | Organised |
Serie A | Cagliari | 684 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
%%sql
WITH wins AS (
SELECT l.name AS League, t.team_long_name AS Team, COUNT(w.who_won) AS 'Win',
a.buildUpPlaySpeedClass AS speed_class, a.buildUpPlayDribblingClass AS dribbling_class,
a.buildUpPlayPassingClass AS passing_class, a.chanceCreationPassingClass AS creation_passing_class,
a.chanceCreationShootingCLass AS creation_shooting_class, a.defencePressureClass AS pressure_class,
a.defenceAggressionClass AS agression_class, a.defenceTeamWidthClass AS team_width_class,
a.defenceDefenderLineClass AS defender_line_class, a.buildUpPlayPositioningClass AS possition_class
FROM win_loss w
JOIN best_leagues l ON l.id = w.league_id
JOIN best_teams t ON t.team_api_id = w.who_won
JOIN Team_Attributes a ON a.team_api_id = t.team_api_id
GROUP BY w.who_won
ORDER BY Win DESC
)
SELECT * FROM (SELECT * FROM wins WHERE League = 'Bundesliga' ORDER BY Win DESC LIMIT 5)
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'League 1' ORDER BY Win DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'La Liga' ORDER BY Win DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'Premier League' ORDER BY Win DESC LIMIT 5 )
UNION ALL
SELECT * FROM (SELECT * FROM wins WHERE League = 'Serie A' ORDER BY Win DESC LIMIT 5 )
* sqlite:///database.sqlite Done.
League | Team | Win | speed_class | dribbling_class | passing_class | creation_passing_class | creation_shooting_class | pressure_class | agression_class | team_width_class | defender_line_class | possition_class |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Bundesliga | FC Bayern Munich | 1158 | Balanced | Little | Mixed | Normal | Lots | High | Press | Normal | Cover | Free Form |
Bundesliga | Borussia Dortmund | 942 | Fast | Little | Mixed | Normal | Lots | Medium | Double | Normal | Cover | Organised |
Bundesliga | Bayer 04 Leverkusen | 822 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Bundesliga | FC Schalke 04 | 762 | Balanced | Little | Mixed | Normal | Lots | High | Press | Wide | Cover | Organised |
Bundesliga | VfL Wolfsburg | 702 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Organised |
League 1 | Paris Saint-Germain | 1050 | Balanced | Little | Mixed | Normal | Normal | High | Double | Wide | Offside Trap | Organised |
League 1 | Olympique Lyonnais | 918 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Organised |
League 1 | LOSC Lille | 882 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
League 1 | Olympique de Marseille | 858 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Organised |
League 1 | Girondins de Bordeaux | 756 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
La Liga | FC Barcelona | 1404 | Balanced | Little | Mixed | Normal | Lots | High | Press | Wide | Cover | Free Form |
La Liga | Real Madrid CF | 1368 | Balanced | Little | Mixed | Risky | Lots | High | Press | Normal | Cover | Free Form |
La Liga | Atlético Madrid | 1002 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Organised |
La Liga | Valencia CF | 852 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
La Liga | Sevilla FC | 834 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Premier League | Manchester United | 1152 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Organised |
Premier League | Chelsea | 1056 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Free Form |
Premier League | Manchester City | 1050 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Premier League | Arsenal | 1020 | Balanced | Little | Short | Safe | Normal | Deep | Press | Normal | Cover | Free Form |
Premier League | Tottenham Hotspur | 906 | Balanced | Little | Mixed | Risky | Lots | Deep | Press | Normal | Cover | Free Form |
Serie A | Juventus | 1134 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Wide | Offside Trap | Free Form |
Serie A | Roma | 972 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Offside Trap | Free Form |
Serie A | Inter | 924 | Balanced | Little | Mixed | Normal | Lots | Medium | Press | Normal | Cover | Free Form |
Serie A | Milan | 924 | Balanced | Little | Mixed | Normal | Normal | Medium | Press | Normal | Cover | Organised |
Serie A | Napoli | 918 | Balanced | Little | Mixed | Normal | Normal | Medium | Double | Normal | Cover | Free Form |
Surpringsly, there are not that many differences in the playstyle of the teams with more winds related to the teams with more losses. I will need to look at other parameters. However there are some differences to take into account:
Speed class
: Mostly balanced - Allows to defend, attack, and dominate the middle of the park effectively. It is mostly defined by the experience and quality of the players.
Dribbling class
: Mostly little - Creativy of players on 1 to 1 situations.
Creation Passing Class
: Mostly Normal - Ability and accuracy of the team passing the ball. Neither risky or safe, most like a combination of both. Risky class means dragging your players out of position, making the team sometimes susceptible to counterattacks; on the other hand, when playing against high pressure can be effective because of the space the team can exploit.
Pressure Class
: Mixing of Medium, High, and Deep - Define how the team pressures the other.
Agression Class
: Mostly Press - Together with pressure, can have the most impact on the game.
Team Width Class
: Mostly Normal and some wides. Defines how the team position itself on the pitch when defending. Nothing to do with how the team attacks.
Defender Line Class
: Mostly Cover
Creation Shooting Class
: Teams with more losses have mostly a category of Normal. Teams with more wins, somehow predictable, have mostly a category of Lots, meaning a higher class creation shooting class. This means, the players possition themselves more on the edge of the box to shoot. Shooting on low means the players run more into the box before shooting. Many times, this doesn't defines the team wins more, sometimes it is related to the playstyle of the team. For example, Barcelona uses to have a low shooting class and it is one of the teams with more wins.Passing class
: Teams with more losses tend to have Long passing class. However teams with more wins are more in the "Mixed" and "Short" categories. Teams with "Mixed" category tends to use of a combination of short and long passing class. On short passing, the plays come shorter to look for passes. On long class, they will stay away to start their run for the passes. The last one has some effect on the defensive position. When players stay further away from each other, they will create space for the opposition to dive into counters. Possition Class
: Even though both of them have predominance of "Organized" category, teams with more wins tend to have a "Free Form" category. "Organized" is a safer option for the team, the central defenders keep their position better, and the midfielders are defensively better positioned. However, on "Free Form" the teams are more creative; the central defenders are more dynamic and make more runs, but they, together with the midfielders, can get caught out of position easier, and they more prone to counterattacks, depending most on talent and team work for defending certain game situations.Note: Even though some differences and particularities were extracted to identify playstyle that can lead the teams to more victories, a different approach may be useful for better and more concised conclusions
In the following, I compare the different attributes that led to more victories in the different teams. Using a different approach than before. Previously, I visually compared the teams with more victories and their attirbutes, trying to find a relationship between victory and class of attributes. However, this approach can have a huge bias in some attributes when analyzing their relationship. Teams with more victories are usually teams with more economic power, in other words, where the best players are. So, sometimes, individual talent makes the difference in the games in order to obtain the victory, regardless of the type of strategy the team may take. There is other groups more similar regarding theirt economic power and quality of their players, in these teams their playstyle may have a bigger role when achieving victories. With the following approach, I am looking to reveal that relationship by grouping the different classes by category with the number of wins and loss teams had using these categories.
%%sql
SELECT a.buildUpPlaySpeedClass AS 'Speed Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.buildUpPlaySpeedClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Speed Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Fast | 15.0 | 12.0 | 1.28 |
Slow | 16.0 | 13.0 | 1.24 |
Balanced | 15.0 | 13.0 | 1.12 |
%%sql
SELECT a.buildUpPlayDribblingClass AS 'Dribbling Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.buildUpPlayDribblingClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Dribbling Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Lots | 17.0 | 11.0 | 1.62 |
Little | 15.0 | 13.0 | 1.15 |
Normal | 15.0 | 13.0 | 1.12 |
%%sql
SELECT a.buildUpPlayPassingClass AS 'Passing Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.buildUpPlayPassingClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Passing Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Short | 18.0 | 11.0 | 1.67 |
Mixed | 15.0 | 13.0 | 1.12 |
Long | 12.0 | 15.0 | 0.79 |
%%sql
SELECT a.buildUpPlayPositioningClass AS 'Positioning Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.buildUpPlayPositioningClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Positioning Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Free Form | 21.0 | 9.0 | 2.43 |
Organised | 14.0 | 13.0 | 1.07 |
%%sql
SELECT a.chanceCreationPassingClass AS 'Creation Passing Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.chanceCreationPassingClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Creation Passing Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Risky | 16.0 | 12.0 | 1.32 |
Normal | 15.0 | 13.0 | 1.12 |
Safe | 15.0 | 13.0 | 1.11 |
%%sql
SELECT a.chanceCreationCrossingClass AS 'Creation Crossing Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.chanceCreationCrossingClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Creation Crossing Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Little | 18.0 | 10.0 | 1.82 |
Normal | 15.0 | 13.0 | 1.14 |
Lots | 14.0 | 13.0 | 1.09 |
%%sql
SELECT a.chanceCreationShootingClass AS 'Creation Shooting Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.chanceCreationShootingClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Creation Shooting Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Lots | 18.0 | 11.0 | 1.64 |
Little | 16.0 | 12.0 | 1.33 |
Normal | 14.0 | 13.0 | 1.05 |
%%sql
SELECT a.chanceCreationPositioningClass AS 'Creation Positioning Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.chanceCreationPositioningClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Creation Positioning Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Free Form | 19.0 | 10.0 | 1.87 |
Organised | 14.0 | 14.0 | 1.02 |
%%sql
SELECT a.defencePressureClass AS 'Defence Pressure Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.defencePressureClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Defence Pressure Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
High | 17.0 | 11.0 | 1.45 |
Medium | 15.0 | 13.0 | 1.16 |
Deep | 13.0 | 14.0 | 0.9 |
%%sql
SELECT a.defenceAggressionClass AS 'Defence Aggression Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.defenceAggressionClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Defence Aggression Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Double | 15.0 | 12.0 | 1.27 |
Press | 15.0 | 13.0 | 1.14 |
Contain | 14.0 | 14.0 | 1.01 |
%%sql
SELECT a.defenceTeamWidthClass AS 'Defence Team Width Class', ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.defenceTeamWidthClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Defence Team Width Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Wide | 15.0 | 13.0 | 1.17 |
Normal | 15.0 | 13.0 | 1.16 |
Narrow | 12.0 | 15.0 | 0.78 |
%%sql
SELECT a.defenceDefenderLineClass AS 'Defence Line Class',ROUND(AVG(v.Wins)) AS 'Average of Wins',
ROUND(AVG(l.Losses)) AS "Average of Losses", ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.defenceDefenderLineClass
ORDER BY "Rate of Wins vs Losses" DESC;
* sqlite:///database.sqlite Done.
Defence Line Class | Average of Wins | Average of Losses | Rate of Wins vs Losses |
---|---|---|---|
Offside Trap | 16.0 | 12.0 | 1.34 |
Cover | 15.0 | 13.0 | 1.13 |
Play Speed Class
: Teams with "Fast" Category achieved the greatest win over loss rate. Surpringsly, Teams with "Low" Category obtained higher rate than those with "Balanced" category. This fact contradicts the previous analysis that reflected that the among the 25 teams with more victory (top 5 for each 5 leagues), 24 had a "Balanced" Category and only 1 had "Fast Category". So, for the most powerful teams (economically speaking, as I discussed in the introduction of this section) the speed of Play had not an impact in their victories as for other teams may had.Dribbling Class
: The rate for "Lots" Category as expected had the highest rate of wins over losses.In other words, those teams with players with high creativity on 1 to 1, had a higher probability to win. Even though, this category doesn't seem to have a big impact on achieving victory.Passing Class
: "Short" Category had the highest rate of wins vs lose, with almost the double of rate that "Long" category, which had the lowest. Positioning Class
: "Free form" has a considerably higher rate compared with "Organised". Reaffriming the previously discussed, some of the teams with more wins practice a "Free Form" position class.Creation Passing Class
: This category doesn't seem to have such an impact in wins and losses for the team. Creation Crossing Class
: 'Little' Category had the highest rate of win vs losses compared with the other. This usually means that teams with little creation crossing rely less in long passes from one side of the field to the other. Usually, teams with this type of tactics rely on strategy and the latent of their midfielders, to create and shape their game while moving forward to their rival.Creation Shooting Class
: As anyone could expect, 'Lots' category has the highes rate and probability of winning. This means that teams will achieve victory easier when they shoot more. Defence Pressure Class
: 'High' pressure has the highest rate.Defence Aggression Class
: Doesn't seem to have a high impact in victory for itself. Those teams with "Contain" Category seems to have the lowest chance of victory.Defence Team Width Class
: Those with a "Narrow" Category seems to have the lowest probability of achieving victories. Doesn't seem to have a high impact in victory for itself.Defence Line Class
: Neither of those categories seems to have a impact on victory by itself. Offside Trap is slightly higher.Final Notes:
With this analysis, I confirm that attributes such Passing Class
, and Possition Class
have an impact by itself in the possibility of a team to obtain victory. Free Form Possition Class had a rate over 2 of wins over losses, meaning you have twice the probability of winning than lossing using this playstyle. In addition, Dribbling
and Crossing
attributes also has an impact in this probability.
Passing Class
and Dribbling Class
attributes are categories that mainly depend on the talent of the players, so there could be the issue that players quality lead to the type of playstyle use under this category. On the other side, Possition Class
and Crossing Class
may depend more on the strategty of the team work. These types of strategies are usually resulting of training and team play style.
Image taken from hindustatimes
In this section, I rank the combination of attributes with more relevance for achieving victory, to identify the combination with the highest probability of victory.
%%sql
SELECT a.buildUpPlaySpeedClass AS 'Speed Class', a.buildUpPlayDribblingClass AS 'Dribbling Class',
a.buildUpPlayPassingClass AS 'Passing Class', a.buildUpPlayPositioningClass AS 'Positioning Class',
a.chanceCreationShootingClass AS 'Creation Shooting Class', a.defenceTeamWidthClass AS 'Defence Team Width Class',
ROUND(AVG(v.Wins)) AS 'Average of Wins', ROUND(AVG(l.Losses)) AS "Average of Losses",
ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses", RANK() OVER(ORDER BY ROUND(AVG(v.Wins)/AVG(l.Losses),2) DESC) AS "Ranking"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.buildUpPlaySpeedClass, a.buildUpPlayDribblingClass, a.buildUpPlayPassingClass, a.buildUpPlayPositioningClass,
a.chanceCreationShootingClass, a.defenceTeamWidthClass
ORDER BY "Rate of Wins vs Losses" DESC
LIMIT 10;
* sqlite:///database.sqlite Done.
Speed Class | Dribbling Class | Passing Class | Positioning Class | Creation Shooting Class | Defence Team Width Class | Average of Wins | Average of Losses | Rate of Wins vs Losses | Ranking |
---|---|---|---|---|---|---|---|---|---|
Balanced | Normal | Short | Free Form | Normal | Normal | 27.0 | 4.0 | 6.89 | 1 |
Slow | Little | Short | Free Form | Normal | Normal | 28.0 | 4.0 | 6.53 | 2 |
Balanced | Little | Mixed | Free Form | Lots | Wide | 27.0 | 5.0 | 5.55 | 3 |
Balanced | Little | Short | Free Form | Little | Normal | 24.0 | 4.0 | 5.51 | 4 |
Balanced | Little | Short | Free Form | Normal | Normal | 25.0 | 6.0 | 4.59 | 5 |
Balanced | Normal | Mixed | Free Form | Normal | Normal | 25.0 | 6.0 | 4.01 | 6 |
Slow | Normal | Short | Organised | Lots | Normal | 24.0 | 7.0 | 3.6 | 7 |
Balanced | Normal | Short | Organised | Lots | Normal | 23.0 | 7.0 | 3.43 | 8 |
Balanced | Little | Short | Free Form | Lots | Wide | 24.0 | 7.0 | 3.33 | 9 |
Fast | Little | Mixed | Free Form | Lots | Normal | 22.0 | 7.0 | 3.03 | 10 |
%%sql
SELECT a.buildUpPlaySpeedClass AS 'Speed Class', a.buildUpPlayDribblingClass AS 'Dribbling Class',
a.buildUpPlayPassingClass AS 'Passing Class', a.buildUpPlayPositioningClass AS 'Positioning Class',
a.chanceCreationPassingClass AS 'Creation Passing Class', a.chanceCreationCrossingClass AS 'Creation Crossing Class',
a.chanceCreationShootingClass AS 'Creation Shooting Class', a.defencePressureClass AS 'Defence Pressure',
a.defenceAggressionClass AS 'Defence Aggression Class', a.defenceTeamWidthClass AS 'Defence Team Width Class',
a.defenceDefenderLineClass AS 'Defender Line Class',
ROUND(AVG(v.Wins)) AS 'Average of Wins', ROUND(AVG(l.Losses)) AS "Average of Losses",
ROUND(AVG(v.Wins)/AVG(l.Losses),2) AS "Rate of Wins vs Losses",
RANK() OVER(ORDER BY ROUND(AVG(v.Wins)/AVG(l.Losses),2) DESC) AS "Ranking"
FROM Team_Attributes a
INNER JOIN victories_ranking v ON v.team_api_id = a.team_api_id
INNER JOIN losses_ranking l ON l.team_api_id = a.team_api_id
GROUP BY a.buildUpPlaySpeedClass, a.buildUpPlayDribblingClass, a.buildUpPlayPassingClass, a.buildUpPlayPositioningClass,
a.chanceCreationPassingClass, a.chanceCreationCrossingClass, a.chanceCreationShootingClass,
a.defencePressure, a.defenceAggressionClass, a.defenceTeamWidthClass, a.defenceDefenderLineClass
ORDER BY "Rate of Wins vs Losses" DESC
LIMIT 10;
* sqlite:///database.sqlite Done.
Speed Class | Dribbling Class | Passing Class | Positioning Class | Creation Passing Class | Creation Crossing Class | Creation Shooting Class | Defence Pressure | Defence Aggression Class | Defence Team Width Class | Defender Line Class | Average of Wins | Average of Losses | Rate of Wins vs Losses | Ranking |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Balanced | Little | Mixed | Free Form | Normal | Little | Lots | High | Press | Wide | Cover | 29.0 | 3.0 | 8.67 | 1 |
Balanced | Little | Short | Free Form | Normal | Little | Normal | Medium | Press | Normal | Cover | 29.0 | 3.0 | 8.67 | 1 |
Balanced | Little | Short | Free Form | Normal | Normal | Lots | High | Contain | Wide | Offside Trap | 29.0 | 3.0 | 8.67 | 1 |
Balanced | Normal | Mixed | Free Form | Normal | Normal | Normal | Medium | Press | Normal | Cover | 29.0 | 3.0 | 8.67 | 1 |
Balanced | Normal | Short | Free Form | Normal | Little | Normal | Medium | Press | Normal | Cover | 29.0 | 3.0 | 8.67 | 1 |
Slow | Little | Short | Free Form | Normal | Little | Normal | Medium | Press | Normal | Cover | 29.0 | 3.0 | 8.67 | 1 |
Balanced | Little | Mixed | Free Form | Risky | Normal | Lots | Medium | Press | Wide | Cover | 29.0 | 5.0 | 5.7 | 7 |
Balanced | Little | Mixed | Free Form | Risky | Normal | Lots | High | Press | Normal | Cover | 29.0 | 5.0 | 5.7 | 7 |
Balanced | Little | Mixed | Organised | Risky | Normal | Lots | Medium | Press | Normal | Cover | 29.0 | 5.0 | 5.7 | 7 |
Balanced | Little | Short | Free Form | Risky | Normal | Lots | Medium | Press | Normal | Offside Trap | 29.0 | 5.0 | 5.7 | 7 |
%%sql
SELECT * FROM victories_ranking limit 10;
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2008/2009 | 10260 | Premier League | Manchester United | 28 | 1 |
2008/2009 | 8634 | La Liga | FC Barcelona | 27 | 2 |
2008/2009 | 8650 | Premier League | Liverpool | 25 | 3 |
2008/2009 | 8636 | Serie A | Inter | 25 | 3 |
2008/2009 | 8633 | La Liga | Real Madrid CF | 25 | 3 |
2008/2009 | 8455 | Premier League | Chelsea | 25 | 3 |
2008/2009 | 9827 | League 1 | Girondins de Bordeaux | 24 | 7 |
2008/2009 | 8592 | League 1 | Olympique de Marseille | 22 | 8 |
2008/2009 | 8564 | Serie A | Milan | 22 | 8 |
2008/2009 | 9885 | Serie A | Juventus | 21 | 10 |
%%sql
SELECT v.* FROM victories_ranking v
INNER JOIN Team_Attributes a ON a.team_api_id = v.team_api_id
WHERE a.buildUpPlaySpeedClass = 'Balanced' AND a.buildUpPlayDribblingClass = 'Normal' AND
a.buildUpPlayPassingClass = 'Short' AND (a.buildUpPlayPositioningClass = 'Free Form' OR a.buildUpPlayPositioningClass = 'Organised') AND
(a.chanceCreationShootingClass = 'Little' OR a.chanceCreationShootingClass = 'Normal') AND a.defenceTeamWidthClass = 'Normal'
GROUP BY v.season, v.team_api_id;
* sqlite:///database.sqlite Done.
season | team_api_id | League | Team | Wins | ranking |
---|---|---|---|---|---|
2008/2009 | 8456 | Premier League | Manchester City | 15 | 34 |
2008/2009 | 8535 | Serie A | Fiorentina | 21 | 10 |
2008/2009 | 8634 | La Liga | FC Barcelona | 27 | 2 |
2008/2009 | 9823 | Bundesliga | FC Bayern Munich | 20 | 14 |
2008/2009 | 9825 | Premier League | Arsenal | 20 | 14 |
2009/2010 | 8456 | Premier League | Manchester City | 18 | 21 |
2009/2010 | 8535 | Serie A | Fiorentina | 13 | 43 |
2009/2010 | 8634 | La Liga | FC Barcelona | 31 | 1 |
2009/2010 | 9823 | Bundesliga | FC Bayern Munich | 20 | 12 |
2009/2010 | 9825 | Premier League | Arsenal | 23 | 7 |
2009/2010 | 10167 | Serie A | Parma | 14 | 38 |
2010/2011 | 8456 | Premier League | Manchester City | 21 | 7 |
2010/2011 | 8535 | Serie A | Fiorentina | 12 | 46 |
2010/2011 | 8634 | La Liga | FC Barcelona | 30 | 1 |
2010/2011 | 9823 | Bundesliga | FC Bayern Munich | 19 | 15 |
2010/2011 | 9825 | Premier League | Arsenal | 19 | 15 |
2010/2011 | 10167 | Serie A | Parma | 11 | 64 |
2011/2012 | 8456 | Premier League | Manchester City | 28 | 2 |
2011/2012 | 8535 | Serie A | Fiorentina | 10 | 69 |
2011/2012 | 8634 | La Liga | FC Barcelona | 28 | 2 |
2011/2012 | 9823 | Bundesliga | FC Bayern Munich | 23 | 7 |
2011/2012 | 9825 | Premier League | Arsenal | 21 | 11 |
2011/2012 | 10167 | Serie A | Parma | 15 | 30 |
2012/2013 | 8456 | Premier League | Manchester City | 23 | 7 |
2012/2013 | 8535 | Serie A | Fiorentina | 21 | 11 |
2012/2013 | 8634 | La Liga | FC Barcelona | 32 | 1 |
2012/2013 | 9823 | Bundesliga | FC Bayern Munich | 29 | 2 |
2012/2013 | 9825 | Premier League | Arsenal | 21 | 11 |
2012/2013 | 10167 | Serie A | Parma | 13 | 42 |
2013/2014 | 8456 | Premier League | Manchester City | 27 | 4 |
2013/2014 | 8535 | Serie A | Fiorentina | 19 | 20 |
2013/2014 | 8634 | La Liga | FC Barcelona | 27 | 4 |
2013/2014 | 9823 | Bundesliga | FC Bayern Munich | 29 | 2 |
2013/2014 | 9825 | Premier League | Arsenal | 24 | 11 |
2013/2014 | 10167 | Serie A | Parma | 15 | 34 |
2014/2015 | 8456 | Premier League | Manchester City | 24 | 6 |
2014/2015 | 8535 | Serie A | Fiorentina | 18 | 22 |
2014/2015 | 8634 | La Liga | FC Barcelona | 30 | 1 |
2014/2015 | 9823 | Bundesliga | FC Bayern Munich | 25 | 5 |
2014/2015 | 9825 | Premier League | Arsenal | 22 | 10 |
2014/2015 | 10167 | Serie A | Parma | 6 | 96 |
2015/2016 | 8456 | Premier League | Manchester City | 19 | 13 |
2015/2016 | 8535 | Serie A | Fiorentina | 18 | 17 |
2015/2016 | 8634 | La Liga | FC Barcelona | 29 | 2 |
2015/2016 | 9823 | Bundesliga | FC Bayern Munich | 28 | 4 |
2015/2016 | 9825 | Premier League | Arsenal | 20 | 11 |
%%sql
SELECT COUNT(*) FROM best_teams;
* sqlite:///database.sqlite Done.
COUNT(*) |
---|
14585 |
Image taken from diario lateral
%%sql
DROP VIEW IF EXISTS player_ranking;
CREATE VIEW player_ranking AS
SELECT p.player_api_id, p.player_name AS 'Player', ROUND(AVG(a.overall_rating),2) AS 'Rating',
RANK() OVER(ORDER BY ROUND(AVG(a.overall_rating),2) DESC) AS 'Ranking'
FROM Player p
INNER JOIN Player_attributes a ON p.player_api_id = a.player_api_id
GROUP BY p.player_name
ORDER BY ROUND(AVG(a.overall_rating),2) DESC;
* sqlite:///database.sqlite Done. Done.
[]
%%sql
SELECT * FROM player_ranking LIMIT 10;
* sqlite:///database.sqlite Done.
player_api_id | Player | Rating | Ranking |
---|---|---|---|
30981 | Lionel Messi | 92.19 | 1 |
30893 | Cristiano Ronaldo | 91.28 | 2 |
30924 | Franck Ribery | 88.46 | 3 |
30955 | Andres Iniesta | 88.32 | 4 |
35724 | Zlatan Ibrahimovic | 88.29 | 5 |
30834 | Arjen Robben | 87.84 | 6 |
39854 | Xavi Hernandez | 87.64 | 7 |
30829 | Wayne Rooney | 87.22 | 8 |
30657 | Iker Casillas | 86.95 | 9 |
30894 | Philipp Lahm | 86.73 | 10 |
%%sql
SELECT 'Player with higher potential' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.potential AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.potential = (SELECT MAX(potential) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with best heading_accuracy' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.heading_accuracy AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.heading_accuracy = (SELECT MAX(heading_accuracy) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with best dribbling' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.dribbling AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.dribbling = (SELECT MAX(dribbling) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with best ball control' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.ball_control AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.ball_control = (SELECT MAX(ball_control) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with higher free kick accuracy' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.free_kick_accuracy AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.free_kick_accuracy = (SELECT MAX(free_kick_accuracy) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with more interceptions' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.interceptions AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.interceptions = (SELECT MAX(interceptions) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with more penalties' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.penalties AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.penalties = (SELECT MAX(penalties) FROM Player_Attributes)
GROUP BY r.Player
UNION
SELECT 'Player with more sliding tackles' AS 'Attribute', r.Player AS 'Player Name',
r.Ranking AS 'Ranking of Rating', a.sliding_tackle AS 'Value'
FROM player_ranking r
JOIN Player_Attributes a ON a.player_api_id = r.player_api_id
WHERE a.sliding_tackle = (SELECT MAX(sliding_tackle) FROM Player_Attributes)
GROUP BY r.Player
* sqlite:///database.sqlite Done.
Attribute | Player Name | Ranking of Rating | Value |
---|---|---|---|
Player with best ball control | Lionel Messi | 1 | 97 |
Player with best ball control | Ronaldinho | 70 | 97 |
Player with best dribbling | Cristiano Ronaldo | 2 | 97 |
Player with best dribbling | Lionel Messi | 1 | 97 |
Player with best dribbling | Ronaldinho | 70 | 97 |
Player with best heading_accuracy | Nikola Zigic | 2911 | 98 |
Player with higher free kick accuracy | Juninho Pernambucano,20 | 429 | 97 |
Player with higher potential | Lionel Messi | 1 | 97 |
Player with more interceptions | Andrea Pirlo | 29 | 96 |
Player with more interceptions | Timmy Simons | 1132 | 96 |
Player with more penalties | Rickie Lambert | 1895 | 96 |
Player with more sliding tackles | Philipp Lahm | 10 | 95 |
Similar to teams attributes, some attributes influence more in the rating of the player than others. At the same time, offensive players tend to get higher ratings than other positions due to the influence they have in the victory of their teams. Other features, also depend on the style of game of the team.
Due to the limitations of SQLite, a more thoroughly analysis is difficult. A similar analysis to the teams section is repetitive in the procedure and will require too time consuming queries due to the schema of this particular database.
Through the analysis of the football database I have used some sqlite functions and techniques such as Views, Window functions, nested queries, common table expresions among others. The results have been analyzed at the end of each section. Next, it is a quick summary.