Kaggle Competition: "March Madness" NCAA Basketball Tournament Prediction (2)

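The previous post took a first look at the competition's basic data and did some exploratory visualization.

This post continues the exploratory analysis with the detailed game statistics and draws out some more interesting conclusions.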

Game detail information visualization (visualizing the detailed game statistics)

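Columns prefixed with W belong to the winning team; the matching L columns hold the same statistics for the losing team.

  • WFGM: field goals made.
  • WFGA: field goals attempted.
  • WFGM3: three-pointers made. Note that these are already included in WFGM.
  • WFGA3: three-pointers attempted.
  • WFTM: free throws made.
  • WFTA: free throws attempted.
  • WOR: offensive rebounds.
  • WDR: defensive rebounds.
  • WAst: assists.
  • WTO: turnovers.
  • WStl: steals.
  • WBlk: blocks.
  • WPF: personal fouls.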
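
To make the relationship between these columns concrete, here is a minimal sketch that derives shooting percentages and total points from one team's box-score line. The helper name `shooting_summary` and the sample numbers are made up for illustration; the sketch relies only on the fact that three-pointers made are already counted inside FGM (and, by the same convention, three-point attempts inside FGA).

```python
def shooting_summary(fgm, fga, fgm3, fga3, ftm, fta):
    """Derive basic shooting numbers from box-score counts (hypothetical helper)."""
    fgm2 = fgm - fgm3          # FGM already includes three-pointers made
    fga2 = fga - fga3          # assumes FGA likewise includes three-point attempts

    def pct(made, attempted):
        return made / attempted if attempted else float('nan')

    return {
        'FG%': pct(fgm, fga),
        '2P%': pct(fgm2, fga2),
        '3P%': pct(fgm3, fga3),
        'FT%': pct(ftm, fta),
        'points': 2 * fgm2 + 3 * fgm3 + ftm,
    }

# Made-up line: 25 of 55 from the field, 7 of 20 from three, 13 of 18 free throws
summary = shooting_summary(fgm=25, fga=55, fgm3=7, fga3=20, ftm=13, fta=18)
print(summary['points'])  # 2*18 + 3*7 + 13 = 70
```

Comparing percentages rather than raw counts is one way to check statements such as "the winner shoots free throws better", since the bar charts in this section only show differences in counts.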

Compute the mean of each of these statistics separately for the regular season, the NCAA tournament (excluding the final), and the NCAA final, then visualize the results.

cols = ['TeamID','Score','FGM','FGA','FGM3','FGA3','FTM','FTA','OR','DR','Ast','TO','Stl','Blk','PF']
cols = ['W'+col for col in cols] + ['L'+col for col in cols]

regular_data = section1_detail.query('is_regular == 1')[cols].mean().rename('value')
tournament_data = section1_detail.query('is_regular == 0 and DayNum < 154')[cols].mean().rename('value')
final_data = section1_detail.query('DayNum == 154')[cols].mean().rename('value')

# DataFrame.append was removed in pandas 2.0, so build the summary table in one step:
# each Series of means becomes one row
tmp = pd.DataFrame([regular_data, tournament_data, final_data]).reset_index(drop=True)
tmp.insert(0, 'match_type', ['regular', 'tournament', 'final'])
tmp

# plt.figure avoids the extra empty axes that plt.subplots() would create behind the subplot grid
plt.figure(figsize=(20, 3*5))
plt.subplots_adjust(wspace=0.2, hspace=0.4)

for i,col in enumerate(['FGM','FGA','FGM3','FGA3','FTM','FTA','OR','DR','Ast','TO','Stl','Blk','PF']):
#     plt.subplot(5,3,i*3+1)
#     sns.barplot(x="match_type", y='W'+col, data=tmp)
#     plt.subplot(5,3,i*3+2)
#     sns.barplot(x="match_type", y='L'+col, data=tmp)
#     plt.subplot(5,3,i*3+3)
#     tmp['W'+col+'_L'+col] = tmp['W'+col] - tmp['L'+col]
#     sns.barplot(x="match_type", y='W'+col+'_L'+col, data=tmp)
    plt.subplot(5,3,i+1)
    tmp['W'+col+'_L'+col] = tmp['W'+col] - tmp['L'+col]
    sns.barplot(x="match_type", y='W'+col+'_L'+col, data=tmp)
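  • FGM & FGA: in made field goals, the gap between the two teams is smallest in the final. In attempts, the winner of the final appears to shoot more selectively, in other words it wastes fewer shots.
  • FGM3 & FGA3: in the final, the winning side actually makes fewer three-pointers than the losing side. This suggests that in a final, because of defensive schemes, officiating, player nerves and similar factors, teams tend to favor more conservative ways of scoring and rely less on the volatile three-point shot.
  • FTM & FTA: the differences in free throws are small; the winning side appears to convert at a higher rate.
  • Rebounds: Slam Dunk says that whoever controls the rebounds wins the game, so let's check. On offensive rebounds the winning team always has fewer, especially in the final. Fewer offensive rebounds usually points to a more conservative approach: getting back on defense quickly instead of crashing the offensive glass and giving up fast breaks. On defensive rebounds the winning side is clearly better.
  • Assists and turnovers: both are lower in the final. Defensive intensity and the way the game is officiated push teams toward more isolation offense, which is where star players matter, and that style usually produces fewer assists and fewer turnovers.
  • Steals and blocks: these are two different ways of defending. Going for a steal is a gamble, because a failed attempt usually leaves the defender out of position, so steals are rarer in the final. More blocks indicate more attacks into the paint, which again follows from the offensive choices teams make in a final.
  • Personal fouls: in the final the gap between winner and loser is smaller. To some extent this is because officials prefer not to decide the game and let the players play, which is not very friendly to players who rely on drawing fouls.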

Event Data

Each MEvents & WEvents file lists the play-by-play event logs for more than 99.5% of games from that season. Each event is assigned to either a team or a single one of the team's players. Thus if a basket is made by one player and an assist is credited to a second player, that would show up as two separate records. The players are listed by PlayerID within the xPlayers.csv file.

Mens Event Files:

  • MEvents2015.csv, MEvents2016.csv, MEvents2017.csv, MEvents2018.csv, MEvents2019.csv

Womens Event Files:

  • WEvents2015.csv, WEvents2016.csv, WEvents2017.csv, WEvents2018.csv, WEvents2019.csv

We can read in all files and combine into one huge dataframe, one for womens and one for mens.

  • EventID - this is a unique ID for each logged event. The EventID's are different within each year and uniquely identify each play-by-play event. They ought to be listed in chronological order for the events within their game.

  • Season, DayNum, WTeamID, LTeamID - these four columns are sufficient to uniquely identify each game. The games are a mix of Regular Season, NCAA Tourney, and Secondary Tourney games.

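  • WFinalScore, LFinalScore - the final score of the winning team and of the losing team.

  • WCurrentScore, LCurrentScore - the score of each team at the moment of the event.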

  • ElapsedSeconds - this is the number of seconds that have elapsed from the start of the game until the event occurred. With a 20-minute half, that means that an ElapsedSeconds value from 0 to 1200 represents an event in the first half, a value from 1200 to 2400 represents an event in the second half, and a value above 2400 represents an event in overtime. For example, since overtime periods are five minutes long (that's 300 seconds), a value of 2699 would represent one second left in the first overtime.

  • EventTeamID - this is the ID of the team that the event is logged for, which will either be the WTeamID or the LTeamID.

  • EventPlayerID - this is the ID of the player that the event is logged for, as described in the MPlayers.csv file.

  • EventType, EventSubType - these indicate the type of the event that was logged (see listing below).

  • assist - an assist was credited

  • block - a blocked shot

  • steal - a steal

  • sub - a substitution

  • timeout - a timeout: unk=unknown type of timeout; comm=commercial timeout; full=full timeout; short=short timeout

  • turnover - a turnover: unk=unknown type of turnover; 10sec=10 second violation; 3sec=3 second violation; 5sec=5 second violation; bpass=bad pass turnover; dribb=dribbling turnover; lanev=lane violation; lostb=lost ball; offen=offensive turnover (?); offgt=offensive goaltending; other=other type of turnover; shotc=shot clock violation; trav=travelling

  • foul - a foul: unk=unknown type of foul; admT=administrative technical; benT=bench technical; coaT=coach technical; off=offensive foul; pers=personal foul; tech=technical foul

  • fouled - the player was fouled

  • reb - a rebound: deadb=a deadball rebound; def=a defensive rebound; defdb=a defensive deadball rebound; off=an offensive rebound; offdb=an offensive deadball rebound

  • made1, miss1 - a one-point free throw was made or missed, with one of the following subtypes: 1of1=the only free throw of the trip to the line; 1of2=the first of two free throw attempts; 2of2=the second of two free throw attempts; 1of3=the first of three free throw attempts; 2of3=the second of three free throw attempts; 3of3=the third of three free throw attempts; unk=unknown what the free throw sequence is

  • made2, miss2 - a two-point field goal was made or missed, with one of the following subtypes: unk=unknown type of two-point shot; dunk=dunk; lay=layup; tip=tip-in; jump=jump shot; alley=alley-oop; drive=driving layup; hook=hook shot; stepb=step-back jump shot; pullu=pull-up jump shot; turna=turn-around jump shot; wrong=wrong basket

  • made3, miss3 - a three-point field goal was made or missed, with one of the following subtypes: unk=unknown type of three-point shot; jump=jump shot; stepb=step-back jump shot; pullu=pull-up jump shot; turna=turn-around jump shot; wrong=wrong basket

  • jumpb - a jump ball: start=start period; block=block tie-up; heldb=held ball; lodge=lodged ball; lost=jump ball lost; outof=out of bounds; outrb=out of bounds rebound; won=jump ball won
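
The ElapsedSeconds rule described in the column list can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `describe_elapsed` is made up, the half-based labels follow the 20-minute-half description quoted above, and boundary values (exactly 1200 or 2400), which the description leaves ambiguous, are assigned here to the period that is ending.

```python
def describe_elapsed(elapsed_seconds, half_len=1200, ot_len=300):
    """Map an integer ElapsedSeconds value to (period label, seconds left in that period)."""
    regulation = 2 * half_len
    if elapsed_seconds <= half_len:
        return '1st half', half_len - elapsed_seconds
    if elapsed_seconds <= regulation:
        return '2nd half', regulation - elapsed_seconds
    # Overtime periods are numbered from 1; -(-a // b) is integer ceiling division
    ot_number = -(-(elapsed_seconds - regulation) // ot_len)
    period_end = regulation + ot_number * ot_len
    return f'OT{ot_number}', period_end - elapsed_seconds

print(describe_elapsed(2699))  # ('OT1', 1): one second left in the first overtime
```

A helper like this could be applied to the ElapsedSeconds column to split events by period, for example to compare first-half and second-half shot selection.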

Reading the Event Data

mens_events = []
for year in [2015, 2016, 2017, 2018, 2019]:
    mens_events.append(pd.read_csv(Mfolder_path+f'MEvents{year}.csv'))
MEvents = pd.concat(mens_events)
print(MEvents.shape)
MEvents.head()
(13149684, 17)

womens_events = []
for year in [2015, 2016, 2017, 2018, 2019]:
    womens_events.append(pd.read_csv(Wfolder_path+f'WEvents{year}.csv'))
WEvents = pd.concat(womens_events)
print(WEvents.shape)
WEvents.head()
(12744264, 17)

del mens_events
del womens_events
gc.collect()

Count the EventType values and visualize them

EventType = pd.DataFrame({'MEvents': MEvents['EventType'].value_counts(),
                          'WEvents': WEvents['EventType'].value_counts()})
# Sort by the men's counts so the bars are ordered, and name the index explicitly so the
# column created by reset_index() has a predictable name across pandas versions
EventType = EventType.sort_values('MEvents').rename_axis('EventType').reset_index()

# One figure for both bar charts
plt.figure(figsize=(20, 10))

plt.subplot(2,1,1)
sns.barplot(x='EventType', y='MEvents', data=EventType)

plt.subplot(2,1,2)
sns.barplot(x='EventType', y='WEvents', data=EventType)

Substitutions are the most common event on the court, followed by rebounds. Fouls even occur more often than made two-point shots or missed two-point shots.

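  • Worth noting:
    The data also provides X/Y coordinates and the name of the court area each coordinate falls in. The lower-left corner of the court is (0,0), the upper-right corner is (100,100), and the center is (50,50). Rescaling these coordinates to the dimensions of a standard NCAA court and plotting the events on top of it makes the data much easier to read.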

    • X, Y - for games where it is available, this describes an X/Y position on the court where the lower-left corner of the full court is (0,0), the upper-right corner of the full court is (100,100), the exact middle of the full court (where the initial jump ball happens) is (50,50), and so on. The X/Y position is provided for fouls, turnovers, and field-goal attempts (either 2-point or 3-point).
    • Area - for events where an X/Y position is provided, this position is more generally categorized into one of 13 "areas" of the court, as follows: 1=under basket; 2=in the paint; 3=inside right wing; 4=inside right; 5=inside center; 6=inside left; 7=inside left wing; 8=outside right wing; 9=outside right; 10=outside center; 11=outside left; 12=outside left wing; 13=backcourt

Figure: coordinate diagram

Figure: court area diagram

The areas are given as numbers, so build a mapping and use map to get each area's name, then group by area and plot.

#MEvents['Area'].value_counts()
area_mapping = {0: np.nan,
                1: 'under basket',
                2: 'in the paint',
                3: 'inside right wing',
                4: 'inside right',
                5: 'inside center',
                6: 'inside left',
                7: 'inside left wing',
                8: 'outside right wing',
                9: 'outside right',
                10: 'outside center',
                11: 'outside left',
                12: 'outside left wing',
                13: 'backcourt'}
MEvents['Area_Name'] = MEvents['Area'].map(area_mapping)

fig, ax = plt.subplots(figsize=(15, 8))
MEvents_X_Y = MEvents.loc[~MEvents['Area_Name'].isna()].groupby('Area_Name')
for i, d in MEvents_X_Y:
    sns.scatterplot(x='X', y='Y', data=d, label=i, alpha=0.3, ax=ax)
# Place the legend outside the axes, once, after all groups are drawn
ax.legend(loc=[1, 0])
ax.set_xticks([])
ax.set_yticks([])
ax.set_xlabel('')
ax.set_ylabel('')
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
plt.show()

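Visualizing the court
When I first saw this figure I was amazed. After skimming the code I realized it simply unifies the coordinates and then layers basic geometric shapes on top of each other. The coordinate conversion is not entirely trivial, though, and choosing the colors is a craft of its own. The code is pasted below; if I find time later I will break it down step by step, which should reveal more of the tricks of plotting with python-matplotlib.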


from matplotlib.patches import Circle, Rectangle, Arc

def create_ncaa_full_court(ax=None, three_line='mens', court_color='#dfbb85',
                           lw=3, lines_color='black', lines_alpha=0.5,
                           paint_fill='blue', paint_alpha=0.4,
                           inner_arc=False):
    """
    Creates NCAA Basketball Court
    Dimensions are in feet (Court is 94x50 ft)
    Created by: Rob Mulla / https://github.com/RobMulla

    * Note that this function uses "feet" as the unit of measure.
    * NCAA Data is provided on a x range: 0, 100 and y-range 0 to 100
    * To plot X/Y positions first convert to feet like this:
    ```
    Events['X_'] = (Events['X'] * (94/100))
    Events['Y_'] = (Events['Y'] * (50/100))
    ```
    
    ax: matplotlib axes if None gets current axes using `plt.gca`


    three_line: 'mens', 'womens' or 'both' defines 3 point line plotted
    court_color : (hex) Color of the court
    lw : line width
    lines_color : Color of the lines
    lines_alpha : transparency of lines
    paint_fill : Color inside the paint
    paint_alpha : transparency of the "paint"
    inner_arc : paint the dotted inner arc
    """
    if ax is None:
        ax = plt.gca()

    # Create Patches for Court Lines
    # (linewidth and lw are aliases, so the width is passed only once)
    center_circle = Circle((94/2, 50/2), 6,
                           color=lines_color, lw=lw,
                           fill=False, alpha=lines_alpha)
    hoop_left = Circle((5.25, 50/2), 1.5 / 2,
                       color=lines_color, lw=lw,
                       fill=False, alpha=lines_alpha)
    hoop_right = Circle((94-5.25, 50/2), 1.5 / 2,
                        color=lines_color, lw=lw,
                        fill=False, alpha=lines_alpha)

    # Paint - 18 Feet 10 inches which converts to 18.833333 feet - gross!
    # paint_fill is a color, so pass it as facecolor (fill itself is a boolean flag)
    left_paint = Rectangle((0, (50/2)-6), 18.833333, 12,
                           fill=True, facecolor=paint_fill, alpha=paint_alpha,
                           lw=lw, edgecolor=None)
    right_paint = Rectangle((94-18.833333, (50/2)-6), 18.833333,
                            12, fill=True, facecolor=paint_fill, alpha=paint_alpha,
                            lw=lw, edgecolor=None)
    
    left_paint_boarder = Rectangle((0, (50/2)-6), 18.833333, 12,
                           fill=False, alpha=lines_alpha,
                           lw=lw, edgecolor=lines_color)
    right_paint_boarder = Rectangle((94-18.83333, (50/2)-6), 18.833333,
                            12, fill=False, alpha=lines_alpha,
                            lw=lw, edgecolor=lines_color)

    left_arc = Arc((18.833333, 50/2), 12, 12, theta1=-
                   90, theta2=90, color=lines_color, lw=lw,
                   alpha=lines_alpha)
    right_arc = Arc((94-18.833333, 50/2), 12, 12, theta1=90,
                    theta2=-90, color=lines_color, lw=lw,
                    alpha=lines_alpha)
    
    leftblock1 = Rectangle((7, (50/2)-6-0.666), 1, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    leftblock2 = Rectangle((7, (50/2)+6), 1, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    ax.add_patch(leftblock1)
    ax.add_patch(leftblock2)
    
    left_l1 = Rectangle((11, (50/2)-6-0.666), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    left_l2 = Rectangle((14, (50/2)-6-0.666), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    left_l3 = Rectangle((17, (50/2)-6-0.666), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    ax.add_patch(left_l1)
    ax.add_patch(left_l2)
    ax.add_patch(left_l3)
    left_l4 = Rectangle((11, (50/2)+6), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    left_l5 = Rectangle((14, (50/2)+6), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    left_l6 = Rectangle((17, (50/2)+6), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    ax.add_patch(left_l4)
    ax.add_patch(left_l5)
    ax.add_patch(left_l6)
    
    rightblock1 = Rectangle((94-7-1, (50/2)-6-0.666), 1, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    rightblock2 = Rectangle((94-7-1, (50/2)+6), 1, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    ax.add_patch(rightblock1)
    ax.add_patch(rightblock2)

    right_l1 = Rectangle((94-11, (50/2)-6-0.666), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    right_l2 = Rectangle((94-14, (50/2)-6-0.666), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    right_l3 = Rectangle((94-17, (50/2)-6-0.666), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    ax.add_patch(right_l1)
    ax.add_patch(right_l2)
    ax.add_patch(right_l3)
    right_l4 = Rectangle((94-11, (50/2)+6), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    right_l5 = Rectangle((94-14, (50/2)+6), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    right_l6 = Rectangle((94-17, (50/2)+6), 0.166, 0.666,
                           fill=True, alpha=lines_alpha,
                           lw=0, edgecolor=lines_color,
                           facecolor=lines_color)
    ax.add_patch(right_l4)
    ax.add_patch(right_l5)
    ax.add_patch(right_l6)
    
    # 3 Point Line
    if (three_line == 'mens') | (three_line == 'both'):
        # 22' 1.75" distance to center of hoop
        three_pt_left = Arc((6.25, 50/2), 44.291, 44.291, theta1=-78,
                            theta2=78, color=lines_color, lw=lw,
                            alpha=lines_alpha)
        three_pt_right = Arc((94-6.25, 50/2), 44.291, 44.291,
                             theta1=180-78, theta2=180+78,
                             color=lines_color, lw=lw, alpha=lines_alpha)

        # Straight corner segments, 3.34 feet in from each sideline (mens)
        ax.plot((0, 11.25), (3.34, 3.34),
                color=lines_color, lw=lw, alpha=lines_alpha)
        ax.plot((0, 11.25), (50-3.34, 50-3.34),
                color=lines_color, lw=lw, alpha=lines_alpha)
        ax.plot((94-11.25, 94), (3.34, 3.34),
                color=lines_color, lw=lw, alpha=lines_alpha)
        ax.plot((94-11.25, 94), (50-3.34, 50-3.34),
                color=lines_color, lw=lw, alpha=lines_alpha)
        ax.add_patch(three_pt_left)
        ax.add_patch(three_pt_right)

    if (three_line == 'womens') | (three_line == 'both'):
        # womens 3
        three_pt_left_w = Arc((6.25, 50/2), 20.75 * 2, 20.75 * 2, theta1=-85,
                              theta2=85, color=lines_color, lw=lw, alpha=lines_alpha)
        three_pt_right_w = Arc((94-6.25, 50/2), 20.75 * 2, 20.75 * 2,
                               theta1=180-85, theta2=180+85,
                               color=lines_color, lw=lw, alpha=lines_alpha)

        # Straight corner segments, 4.25 feet in from each sideline (womens)
        ax.plot((0, 8.3), (4.25, 4.25), color=lines_color,
                lw=lw, alpha=lines_alpha)
        ax.plot((0, 8.3), (50-4.25, 50-4.25),
                color=lines_color, lw=lw, alpha=lines_alpha)
        ax.plot((94-8.3, 94), (4.25, 4.25),
                color=lines_color, lw=lw, alpha=lines_alpha)
        ax.plot((94-8.3, 94), (50-4.25, 50-4.25),
                color=lines_color, lw=lw, alpha=lines_alpha)

        ax.add_patch(three_pt_left_w)
        ax.add_patch(three_pt_right_w)

    # Add Patches
    ax.add_patch(left_paint)
    ax.add_patch(left_paint_boarder)
    ax.add_patch(right_paint)
    ax.add_patch(right_paint_boarder)
    ax.add_patch(center_circle)
    ax.add_patch(hoop_left)
    ax.add_patch(hoop_right)
    ax.add_patch(left_arc)
    ax.add_patch(right_arc)
    
    if inner_arc:
        left_inner_arc = Arc((18.833333, 50/2), 12, 12, theta1=90,
                             theta2=-90, color=lines_color, lw=lw,
                       alpha=lines_alpha, ls='--')
        right_inner_arc = Arc((94-18.833333, 50/2), 12, 12, theta1=-90,
                        theta2=90, color=lines_color, lw=lw,
                        alpha=lines_alpha, ls='--')
        ax.add_patch(left_inner_arc)
        ax.add_patch(right_inner_arc)

    # Restricted Area Marker
    restricted_left = Arc((6.25, 50/2), 8, 8, theta1=-90,
                        theta2=90, color=lines_color, lw=lw,
                        alpha=lines_alpha)
    restricted_right = Arc((94-6.25, 50/2), 8, 8,
                         theta1=180-90, theta2=180+90,
                         color=lines_color, lw=lw, alpha=lines_alpha)
    ax.add_patch(restricted_left)
    ax.add_patch(restricted_right)
    
    # Backboards
    ax.plot((4, 4), ((50/2) - 3, (50/2) + 3),
            color=lines_color, lw=lw*1.5, alpha=lines_alpha)
    ax.plot((94-4, 94-4), ((50/2) - 3, (50/2) + 3),
            color=lines_color, lw=lw*1.5, alpha=lines_alpha)
    ax.plot((4, 4.6), (50/2, 50/2), color=lines_color,
            lw=lw, alpha=lines_alpha)
    ax.plot((94-4, 94-4.6), (50/2, 50/2),
            color=lines_color, lw=lw, alpha=lines_alpha)

    # Half Court Line
    ax.axvline(94/2, color=lines_color, lw=lw, alpha=lines_alpha)

    # Border
    boarder = Rectangle((0.3,0.3), 94-0.4, 50-0.4, fill=False, lw=3, color='black', alpha=lines_alpha)
    ax.add_patch(boarder)
    
    # Plot Limit
    ax.set_xlim(0, 94)
    ax.set_ylim(0, 50)
    ax.set_facecolor(court_color)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_xlabel('')
    return ax


fig, ax = plt.subplots(figsize=(15, 8.5))
create_ncaa_full_court(ax, three_line='both', paint_alpha=0.4)
plt.show()

With the court above as a base, plotting the X/Y points on top of it becomes very intuitive.

Plotting X, Y Data

  • The visualizations that follow are quite intuitive and fun, so read on.

  • X, Y points are not available for all games, so this is not a complete sample. For example, I wanted to plot the MVPs mentioned below, but there is no coordinate data for them.

  • X/Y coordinates are provided only for fouls, turnovers, and field-goal attempts (either 2-point or 3-point); other events have no coordinates.

A standard NCAA court measures 94 x 50 feet, so rescale the coordinates to match.

# Normalize X, Y positions for court dimensions
# Court is 50 feet wide and 94 feet end to end.
MEvents['X_'] = (MEvents['X'] * (94/100))
MEvents['Y_'] = (MEvents['Y'] * (50/100))

WEvents['X_'] = (WEvents['X'] * (94/100))
WEvents['Y_'] = (WEvents['Y'] * (50/100))
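
Once positions are in feet, distances become meaningful. The sketch below repeats the same rescaling in plain Python and adds a distance-to-hoop helper. The helper names are made up for illustration, the hoop centers are taken from the values used in `create_ncaa_full_court` (5.25 ft in from each baseline, at mid-width), and treating every shot as aimed at the nearer hoop is an assumption of this sketch, not something the data states.

```python
import math

COURT_LENGTH_FT = 94
COURT_WIDTH_FT = 50
# Hoop centers as drawn by create_ncaa_full_court
HOOPS_FT = [(5.25, 25.0), (94 - 5.25, 25.0)]

def to_feet(x, y):
    """Rescale a 0-100 (X, Y) pair to feet on a 94 x 50 court."""
    return x * COURT_LENGTH_FT / 100, y * COURT_WIDTH_FT / 100

def distance_to_nearest_hoop(x, y):
    """Distance in feet from a 0-100 (X, Y) position to the closer hoop."""
    x_ft, y_ft = to_feet(x, y)
    return min(math.hypot(x_ft - hx, y_ft - hy) for hx, hy in HOOPS_FT)

print(to_feet(50, 50))                   # (47.0, 25.0), the center of the court
print(distance_to_nearest_hoop(50, 50))  # 41.75
```

A distance column built this way would make it possible to put a number on claims about "deep" three-pointers instead of judging them by eye.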

Visualize where turnovers happen, separately for men and women

#fouls, turnovers, and field-goal attempts (either 2-point or 3-point). No X/Y data for other events.
fig, ax = plt.subplots(figsize=(15, 7.8))
ms = 10
ax = create_ncaa_full_court(ax, paint_alpha=0.1)
MEvents.query('EventType == "turnover"') \
    .plot(x='X_', y='Y_', style='X',
          title='Turnover Locations (Mens)',
          c='red',
          alpha=0.3,
          figsize=(15, 9),
          label='Turnovers',
          ms=ms,  # marker size
          ax=ax)
ax.set_xlabel('')
ax.get_legend().remove()
plt.show()

fig, ax = plt.subplots(figsize = (15,7.8))
ms = 10
ax = create_ncaa_full_court(ax, paint_alpha=0.2)
WEvents[WEvents['EventType'] == 'turnover']\
    .plot(x = 'X_', y = 'Y_', style = 'o',
          title='Turnover Locations (Womens)',
          alpha = 0.2,
          figsize=(15, 9),
          label='Turnovers',
          ms=ms,
          ax = ax)
ax.set_xlabel('')
ax.get_legend().remove()
plt.show()

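Next, use subplots to show where two-point and three-point shots were made and missed. The code below plots the women's data; the men's plots follow the same pattern with MEvents.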

COURT_COLOR = '#dfbb85'
fig, ax = plt.subplots(2, 2, figsize=(20, 10))
ax1 = ax[0,0]
ax2 = ax[0,1]
ax3 = ax[1,0]
ax4 = ax[1,1]

# Where are 3 pointers made from? (This is really cool)
WEvents.query('EventType == "made3"') \
    .plot(x='X_', y='Y_', style='.',
          color='blue',
          title='3 Pointers Made (Womens)',
          alpha=0.01, ax=ax1)
ax1 = create_ncaa_full_court(ax1, lw=0.5, three_line='womens', paint_alpha=0.1)
ax1.set_facecolor(COURT_COLOR)

WEvents.query('EventType == "miss3"') \
    .plot(x='X_', y='Y_', style='.',
          title='3 Pointers Missed (Womens)',
          color='red',
          alpha=0.01, ax=ax2)
ax2.set_facecolor(COURT_COLOR)
ax2 = create_ncaa_full_court(ax2, lw=0.5, three_line='womens', paint_alpha=0.1)

WEvents.query('EventType == "made2"') \
    .plot(x='X_', y='Y_', style='.',
          color='blue',
          title='2 Pointers Made (Womens)',
          alpha=0.01, ax=ax3)
ax3.set_facecolor(COURT_COLOR)
ax3 = create_ncaa_full_court(ax3, lw=0.5, three_line='womens', paint_alpha=0.1)

WEvents.query('EventType == "miss2"') \
    .plot(x='X_', y='Y_', style='.',
          title='2 Pointers Missed (Womens)',
          color='red',
          alpha=0.01, ax=ax4)
ax4.set_facecolor(COURT_COLOR)
ax4 = create_ncaa_full_court(ax4, lw=0.5, three_line='womens', paint_alpha=0.1)

# Remove legends, ticks and axis labels from all four panels
for a in (ax1, ax2, ax3, ax4):
    a.get_legend().remove()
    a.set_xticks([])
    a.set_yticks([])
    a.set_xlabel('')
plt.show()

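The plots above cover the X/Y coordinates of the whole dataset. Next, combine them with the Players files (MPlayers.csv and WPlayers.csv) to visualize individual players.

Visualizing a single player's data

The Players data contains each player's ID, last name, first name, and the ID of the player's team.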

# A few rows in the men's file are malformed; skip them while reading
# (on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0)
MPlayers = pd.read_csv(Mfolder_path+'MPlayers.csv', on_bad_lines='skip')
WPlayers = pd.read_csv(Wfolder_path+'WPlayers.csv')
MPlayers.head()

Merge with the event data

# Merge Player name onto events
MEvents = MEvents.merge(MPlayers,
              how='left',
              left_on='EventPlayerID',
              right_on='PlayerID')

WEvents = WEvents.merge(WPlayers,
              how='left',
              left_on='EventPlayerID',
              right_on='PlayerID')

Look up the IDs of the MVPs of the 2019 and 2018 champions

# 2019: Virginia Cavaliers, MVP Kyle Guy
MPlayers.query('FirstName == "Kyle" and LastName == "Guy"')
# 2018: Villanova Wildcats, MVP Donte DiVincenzo
MPlayers.query('FirstName == "Donte" and LastName == "DiVincenzo"')
# Neither player has X/Y coordinates in the event data, so their shots cannot be plotted

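It turns out that the X/Y data happens to be missing for both MVPs, so I picked another player who does have data: Ty Jerome, born July 8, 1997 in New Rochelle, New York, an American professional basketball player and point guard who, at the time of writing, plays for the NBA's Phoenix Suns. His ID is 12410.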

MEvents.query('EventPlayerID == 12410')['EventType'].value_counts()

sub         469
assist      376
reb         311
miss3       254
miss2       208
foul        201
made2       193
made3       164
turnover    144
steal       126
made1       117
fouled       63
miss1        32
block         4
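
These counts can be turned into shooting percentages directly. The sketch below copies the made/missed counts from the output above; the helper name `pct` is made up, and the results describe only the logged events in this dataset, so they should not be read as official career statistics.

```python
# Event counts for EventPlayerID 12410, copied from the value_counts() output above
counts = {'made2': 193, 'miss2': 208, 'made3': 164, 'miss3': 254,
          'made1': 117, 'miss1': 32}

def pct(made, missed):
    """Share of attempts that went in."""
    return made / (made + missed)

two_pt = pct(counts['made2'], counts['miss2'])
three_pt = pct(counts['made3'], counts['miss3'])
free_throw = pct(counts['made1'], counts['miss1'])
points = 2 * counts['made2'] + 3 * counts['made3'] + counts['made1']

print(f'2P% {two_pt:.1%}, 3P% {three_pt:.1%}, FT% {free_throw:.1%}, points {points}')
# 2P% 48.1%, 3P% 39.2%, FT% 78.5%, points 995
```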

Visualize his shots

ms = 10 # Marker Size
fig, ax = plt.subplots(figsize=(15, 8))
ax = create_ncaa_full_court(ax)
MEvents.query('EventPlayerID == 12410 and EventType == "made2"') \
    .plot(x='X_', y='Y_', style='o',
          title='Shots (Ty Jerome)',
          alpha=0.5,
         figsize=(15, 8),
         label='Made 2',
         ms=ms,
         ax=ax)
plt.legend()
MEvents.query('EventPlayerID == 12410 and EventType == "miss2"') \
    .plot(x='X_', y='Y_', style='X',
          alpha=0.5, ax=ax,
         label='Missed 2',
         ms=ms)
plt.legend()
MEvents.query('EventPlayerID == 12410 and EventType == "made3"') \
    .plot(x='X_', y='Y_', style='o',
          c='brown',
          alpha=0.5,
         figsize=(15, 8),
         label='Made 3', ax=ax,
         ms=ms)
plt.legend()
MEvents.query('EventPlayerID == 12410 and EventType == "miss3"') \
    .plot(x='X_', y='Y_', style='X',
          c='green',
          alpha=0.5, ax=ax,
         label='Missed 3',
         ms=ms)
ax.set_xlabel('')
plt.legend()
plt.show()

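Now look at the data for Zion Williamson, ID 2825

  • Zion Williamson, born July 6, 2000 in Spartanburg, South Carolina, is an American professional basketball player, a power forward for the NBA's New Orleans Pelicans. He entered the NBA in 2019 as the first overall draft pick.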
MPlayers.query('FirstName == "Zion" and  LastName == "Williamson"')
ms = 10 # Marker Size
FirstName = 'Zion'
LastName = 'Williamson'
fig, ax = plt.subplots(figsize=(15, 8))
ax = create_ncaa_full_court(ax)
MEvents.query('EventPlayerID == 2825 and EventType == "made2"') \
    .plot(x='X_', y='Y_', style='o',
          title='Shots (Zion Williamson)',
          alpha=0.5,
         figsize=(15, 8),
         label='Made 2',
         ms=ms,
         ax=ax)
plt.legend()
MEvents.query('EventPlayerID == 2825 and EventType == "miss2"') \
    .plot(x='X_', y='Y_', style='X',
          alpha=0.5, ax=ax,
         label='Missed 2',
         ms=ms)
plt.legend()
MEvents.query('EventPlayerID == 2825 and EventType == "made3"') \
    .plot(x='X_', y='Y_', style='o',
          c='brown',
          alpha=0.5,
         figsize=(15, 8),
         label='Made 3', ax=ax,
         ms=ms)
plt.legend()
MEvents.query('EventPlayerID == 2825 and EventType == "miss3"') \
    .plot(x='X_', y='Y_', style='X',
          c='green',
          alpha=0.5, ax=ax,
         label='Missed 3',
         ms=ms)
ax.set_xlabel('')
plt.legend()
plt.show()

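Next, look at Katie Lou Samuelson (ID 3163) from the women's game. She is known as a three-point shooter and, like Stephen Curry, is very good at deep threes from well beyond the arc. Let's plot her shots and see whether that is actually the case.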

WPlayers.query('FirstName == "Katie Lou" and  LastName == "Samuelson"')

fig, ax = plt.subplots(figsize=(15, 8))
ax = create_ncaa_full_court(ax, three_line='womens')
WEvents.query('EventPlayerID == 1821 and EventType == "made2"') \
    .plot(x='X_', y='Y_', style='o',
          title='Shots (Katie Lou Samuelson)',
          alpha=0.5,
         figsize=(15, 8),
         label='Made 2',
         ms=ms,
         ax=ax)
plt.legend()
WEvents.query('EventPlayerID == 1821 and EventType == "miss2"') \
    .plot(x='X_', y='Y_', style='X',
          alpha=0.5, ax=ax,
         label='Missed 2',
         ms=ms)
plt.legend()
WEvents.query('EventPlayerID == 1821 and EventType == "made3"') \
    .plot(x='X_', y='Y_', style='o',
          c='brown',
          alpha=0.5,
         figsize=(15, 8),
         label='Made 3', ax=ax,
         ms=ms)
plt.legend()
WEvents.query('EventPlayerID == 1821 and EventType == "miss3"') \
    .plot(x='X_', y='Y_', style='X',
          c='green',
          alpha=0.5, ax=ax,
         label='Missed 3',
         ms=ms)
ax.set_xlabel('')
plt.legend()
plt.show()

The plot shows clearly that this is indeed the case.

Shot heatmaps

Count the locations of all two-point and three-point attempts, whether made or missed, and draw heatmaps, then compare the shooting preferences of men's and women's players.

N_bins = 100
# Select the shot events whose X coordinate is not 0
shot_events = MEvents.loc[MEvents['EventType'].isin(['miss3','made3','miss2','made2']) & (MEvents['X_'] != 0)]
fig, ax = plt.subplots(figsize=(15, 7))
ax = create_ncaa_full_court(ax,
                            paint_alpha=0.0,
                            three_line='mens',
                            court_color='black',
                            lines_color='white')
plt.hist2d(shot_events['X_'].values + np.random.normal(0, 0.1, shot_events['X_'].shape), # Add Jitter to values for plotting
           shot_events['Y_'].values + np.random.normal(0, 0.1, shot_events['Y_'].shape),
           bins=N_bins, norm=mpl.colors.LogNorm(),
               cmap='plasma')

# Plot a colorbar with label.
cb = plt.colorbar()
cb.set_label('Number of shots')

ax.set_title('Shot Heatmap (Mens)')
plt.show()

N_bins = 100
shot_events = WEvents.loc[WEvents['EventType'].isin(['miss3','made3','miss2','made2']) & (WEvents['X_'] != 0)]
fig, ax = plt.subplots(figsize=(15, 7))
ax = create_ncaa_full_court(ax, three_line='womens', paint_alpha=0.0,
                            court_color='black',
                            lines_color='white')
plt.hist2d(shot_events['X_'].values + np.random.normal(0, 0.2, shot_events['X_'].shape),
           shot_events['Y_'].values + np.random.normal(0, 0.2, shot_events['Y_'].shape),
           bins=N_bins, norm=mpl.colors.LogNorm(),
               cmap='plasma')

# Plot a colorbar with label.
cb = plt.colorbar()
cb.set_label('Number of shots')

ax.set_title('Shot Heatmap (Womens)')
plt.show()

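Comparing the men's and women's games, one interesting observation is that many of the men's shots come from directly under the basket, whereas the women's hot spots appear more to the left and right of the basket.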
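
The heatmaps above rest on a simple idea: divide the court into a grid and count the shots that fall into each cell. The following plain-Python sketch shows that binning step on its own. The function name `bin_shots` and the sample points are made up for illustration, and unlike `plt.hist2d`, which derives its bin edges from the range of the data, this sketch bins over the full 94 x 50 ft court.

```python
from collections import Counter

def bin_shots(points_ft, n_bins=100, length=94.0, width=50.0):
    """Count shots per grid cell; a plain-Python stand-in for a 2D histogram."""
    counts = Counter()
    for x, y in points_ft:
        # Clamp so that points on the far edge fall into the last bin
        col = min(int(x / length * n_bins), n_bins - 1)
        row = min(int(y / width * n_bins), n_bins - 1)
        counts[(col, row)] += 1
    return counts

# Made-up shot locations in feet: three near the left hoop, one in the right corner
shots = [(5.0, 26.0), (5.2, 26.5), (5.4, 27.0), (90.0, 3.0)]
grid = bin_shots(shots, n_bins=10)
print(grid.most_common(1))  # [((0, 5), 3)]
```

Because a few cells near the basket collect far more shots than the rest, the plots above color the counts on a logarithmic scale (`LogNorm`), which keeps the less busy areas of the court visible.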

Next, look at the average points scored from each coordinate

# Points per event: 2 or 3 for made shots, 0 for everything else
# (missed shots are labelled 'miss2'/'miss3' and simply keep the default of 0)
MEvents['PointsScored'] = 0
MEvents.loc[MEvents['EventType'] == 'made2', 'PointsScored'] = 2
MEvents.loc[MEvents['EventType'] == 'made3', 'PointsScored'] = 3
avg_pnt_xy = MEvents.loc[MEvents['EventType'].isin(['miss3','made3','miss2','made2']) & (MEvents['X_'] != 0)] \
     .groupby(['X_','Y_'])['PointsScored'].mean().reset_index()
#avg_pnt_xy.plot(x='X_',y='Y_', style='.')

bins = [0,0.5,1,1.33,1.67,2,2.5,3]
avg_pnt_xy['PointsScored'] = pd.cut(avg_pnt_xy['PointsScored'],bins)
fig, ax = plt.subplots(figsize=(15, 8))
ax = sns.scatterplot(data=avg_pnt_xy, x='X_', y='Y_', hue='PointsScored')
ax = create_ncaa_full_court(ax)
plt.legend(loc=[1,0])
plt.show()
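One detail of `pd.cut` is worth keeping in mind when reading this plot: with the default `right=True`, each bin is open on the left, so a location whose average is exactly 0 falls outside `(0, 0.5]` and becomes `NaN`. A minimal sketch on made-up averages (the Series `avg` is illustrative, not competition data) shows the effect and how `include_lowest=True` changes it:

```python
import pandas as pd

bins = [0, 0.5, 1, 1.33, 1.67, 2, 2.5, 3]
avg = pd.Series([0.0, 0.4, 1.5, 3.0])  # made-up per-location averages

default_cut = pd.cut(avg, bins)                          # 0.0 is not binned
inclusive_cut = pd.cut(avg, bins, include_lowest=True)   # 0.0 joins the first bin

print(default_cut.isna().tolist())    # [True, False, False, False]
print(inclusive_cut.isna().tolist())  # [False, False, False, False]
```

Whether locations that never score should appear in the legend is a presentation choice; the point is only that they are dropped silently under the defaults.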

Next, let's count the number of shots at each coordinate.

# Flag each event as a made or a missed shot (both default to False)
MEvents['Made'] = False
MEvents['Missed'] = False
MEvents.loc[MEvents['EventType'].isin(['made2', 'made3']), 'Made'] = True
# Missed shots are labelled 'miss2' / 'miss3' in EventType
MEvents.loc[MEvents['EventType'].isin(['miss2', 'miss3']), 'Missed'] = True

# Select both columns with a list; tuple-style ['Made','Missed'] fails on recent pandas
avg_made_xy = MEvents.loc[MEvents['EventType'].isin(['miss3','made3','miss2','made2']) & (MEvents['X_'] != 0)] \
     .groupby(['X_','Y_'])[['Made','Missed']].sum().reset_index()

bins = [0,25,50,100,200,1000,2000]
avg_made_xy['Made'] = pd.cut(avg_made_xy['Made'],bins)

fig, ax = plt.subplots(figsize=(15, 8))
# seaborn colors the hue levels through `palette`; `cmap` has no effect here
ax = sns.scatterplot(data=avg_made_xy, x='X_', y='Y_', palette='plasma', hue='Made')
ax = create_ncaa_full_court(ax, paint_alpha=0)
ax.set_title('Number of Shots Made')
plt.legend(loc=[1, 0])
plt.show()
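The same made and missed counts can also be turned into a shooting percentage per location. The sketch below is a rough illustration that uses a tiny hand-made frame in place of `MEvents`: the rows are invented, only the column names `EventType`, `X_` and `Y_` come from the code above, and the helper name `shooting_pct_by_location` is hypothetical.

```python
import pandas as pd

def shooting_pct_by_location(events):
    """Count made shots and attempts per (X_, Y_) and compute the make rate."""
    is_shot = events['EventType'].isin(['made2', 'made3', 'miss2', 'miss3'])
    shots = events[is_shot & (events['X_'] != 0)].copy()
    shots['Made'] = shots['EventType'].str.startswith('made')
    out = (shots.groupby(['X_', 'Y_'])['Made']
                .agg(Made='sum', Attempts='count')
                .reset_index())
    out['Pct'] = out['Made'] / out['Attempts']
    return out

# Invented rows: three attempts at (10, 50), two at (30, 20), one non-shot event
toy = pd.DataFrame({
    'EventType': ['made2', 'miss2', 'made3', 'miss3', 'made2', 'foul'],
    'X_': [10, 10, 30, 30, 10, 10],
    'Y_': [50, 50, 20, 20, 50, 50],
})
result = shooting_pct_by_location(toy)
print(result)
```

A percentage is only meaningful where there are enough attempts, so on real data it would probably make sense to filter on `Attempts` before plotting.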

End of this section.

Next time: how to gauge each team's strength and produce a ranking using only the matchup information.
