How to Specify Table for BeautifulSoup to Find

BeautifulSoup: Get the contents of a specific table

This is not the specific code you need, just a demo of how to work with BeautifulSoup. It finds the table whose id is "Table1" and gets all of its tr elements.

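from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen(url).read()  # url is assumed to be defined by the caller
bs = BeautifulSoup(html, 'html.parser')
table = bs.find('table', id='Table1')  # first <table> whose id is "Table1"
rows = table.find_all('tr')            # every <tr> inside that table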
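
The same lookup can also be written as a CSS selector. The sketch below is only an illustration: the SAMPLE_HTML string is a made-up page standing in for the real response, and it adds a None check because the lookup returns nothing when the id is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; in real use the HTML comes from your HTTP response.
SAMPLE_HTML = """
<table id="Other"><tr><td>ignore me</td></tr></table>
<table id="Table1">
  <tr><td>a</td><td>b</td></tr>
  <tr><td>c</td><td>d</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# CSS selector form of "the table whose id is Table1"
table = soup.select_one("table#Table1")

# select_one returns None when nothing matches, so check before using the result
rows = table.find_all("tr") if table is not None else []
cells = [[td.get_text(strip=True) for td in row.find_all("td")] for row in rows]
print(cells)
```

If the id is not on the page, rows is simply an empty list instead of the code failing with an AttributeError on None.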

How to specify table for BeautifulSoup to find?

In this case I'd probably just use pandas to retrieve all the tables, then index into the result for the appropriate one:

import pandas as pd

table = pd.read_html('https://nces.ed.gov/collegenavigator/?id=139755')[10]
print(table)

If you are worried about the table order changing in the future, you could loop over the tables returned by read_html and test each one for a unique string that identifies the table you want. Alternatively, use the bs4 :has and :contains selectors (bs4 4.7.1+; newer Soup Sieve releases spell the latter :-soup-contains) to identify the right table, then pass it to read_html or continue handling it with bs4:

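from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://nces.ed.gov/collegenavigator/?id=139755')
soup = bs(r.content, 'lxml')
# Select the one table that has a cell containing the identifying text,
# then hand only that table's HTML to pandas (read_html returns a list)
table = pd.read_html(StringIO(str(soup.select_one('table:has(td:contains("Average net price"))'))))
print(table)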
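
The first option, looping over the DataFrames and testing each for a unique string, might look roughly like this. It is a sketch under assumptions: find_frame is a hypothetical helper name, and the two frames are built by hand so the example runs offline; in real use frames would be the list returned by pd.read_html.

```python
import pandas as pd

def find_frame(frames, needle):
    """Return the first DataFrame whose header or cells contain `needle`, else None."""
    for df in frames:
        in_header = any(needle in str(col) for col in df.columns)
        # Compare against the string form of every cell
        in_cells = any(needle in cell for cell in df.astype(str).to_numpy().ravel())
        if in_header or in_cells:
            return df
    return None

# Hand-made stand-ins for the list that pd.read_html(url) would return
frames = [
    pd.DataFrame({"Year": [2018, 2019], "Enrollment": [100, 120]}),
    pd.DataFrame({"Item": ["Average net price", "Tuition"], "Amount": ["$10,000", "$12,000"]}),
]

match = find_frame(frames, "Average net price")
print(match)
```

This keeps working if the site inserts or removes tables ahead of the one you want, which a fixed index such as [10] does not.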

Python Beautiful Soup can't find specific table

The tables are rendered after the page loads, so you would need Selenium to let them render, or one of the approaches mentioned above. That isn't necessary here, though, because most of the tables sit inside HTML comments. You can use BeautifulSoup to pull out the comments, then search through those for the table tags.

import requests
from bs4 import BeautifulSoup
from bs4 import Comment
import pandas as pd

#NBA season
year = 2019

url = 'https://www.basketball-reference.com/leagues/NBA_{}.html#all_team-stats-base'.format(year)
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

comments = soup.find_all(string=lambda text: isinstance(text, Comment))

tables = []
for each in comments:
    if 'table' in each:
        try:
            tables.append(pd.read_html(each)[0])
        except Exception:
            # Skip comments that pandas cannot parse into a table
            continue

This gives you a list of DataFrames, so just pull out the table you want by its index position:

Output:

print (tables[3])
      Rk                     Team   G     MP    FG  ...  STL  BLK   TOV    PF   PTS
0    1.0         Milwaukee Bucks*  82  19780  3555  ...  615  486  1137  1608  9686
1    2.0   Golden State Warriors*  82  19805  3612  ...  625  525  1169  1757  9650
2    3.0     New Orleans Pelicans  82  19755  3581  ...  610  441  1215  1732  9466
3    4.0      Philadelphia 76ers*  82  19805  3407  ...  606  432  1223  1745  9445
4    5.0    Los Angeles Clippers*  82  19830  3384  ...  561  385  1193  1913  9442
5    6.0  Portland Trail Blazers*  82  19855  3470  ...  546  413  1135  1669  9402
6    7.0   Oklahoma City Thunder*  82  19855  3497  ...  766  425  1145  1839  9387
7    8.0         Toronto Raptors*  82  19880  3460  ...  680  437  1150  1724  9384
8    9.0         Sacramento Kings  82  19730  3541  ...  679  363  1095  1751  9363
9   10.0       Washington Wizards  82  19930  3456  ...  683  379  1154  1701  9350
10  11.0         Houston Rockets*  82  19830  3218  ...  700  405  1094  1803  9341
11  12.0            Atlanta Hawks  82  19855  3392  ...  675  419  1397  1932  9294
12  13.0   Minnesota Timberwolves  82  19830  3413  ...  683  411  1074  1664  9223
13  14.0          Boston Celtics*  82  19780  3451  ...  706  435  1052  1670  9216
14  15.0           Brooklyn Nets*  82  19980  3301  ...  539  339  1236  1763  9204
15  16.0       Los Angeles Lakers  82  19780  3491  ...  618  440  1284  1701  9165
16  17.0               Utah Jazz*  82  19755  3314  ...  663  483  1240  1728  9161
17  18.0       San Antonio Spurs*  82  19805  3468  ...  501  386   992  1487  9156
18  19.0        Charlotte Hornets  82  19830  3297  ...  591  405  1001  1550  9081
19  20.0          Denver Nuggets*  82  19730  3439  ...  634  363  1102  1644  9075
20  21.0         Dallas Mavericks  82  19780  3182  ...  533  351  1167  1650  8927
21  22.0          Indiana Pacers*  82  19705  3390  ...  713  404  1122  1594  8857
22  23.0             Phoenix Suns  82  19880  3289  ...  735  418  1279  1932  8815
23  24.0           Orlando Magic*  82  19780  3316  ...  543  445  1082  1526  8800
24  25.0         Detroit Pistons*  82  19855  3185  ...  569  331  1135  1811  8778
25  26.0               Miami Heat  82  19730  3251  ...  627  448  1208  1712  8668
26  27.0            Chicago Bulls  82  19905  3266  ...  603  351  1159  1663  8605
27  28.0          New York Knicks  82  19780  3134  ...  557  422  1151  1713  8575
28  29.0      Cleveland Cavaliers  82  19755  3189  ...  534  195  1106  1642  8567
29  30.0        Memphis Grizzlies  82  19880  3113  ...  684  448  1147  1801  8490
30   NaN           League Average  82  19815  3369  ...  626  406  1155  1714  9119

[31 rows x 25 columns]
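
To see the comment trick in isolation, here is a small offline sketch that sticks to bs4 and re-parses each comment instead of handing it to pandas. The sample markup and the ids "all_team-stats" and "team-stats" are invented for illustration; only the idea (tables hidden inside HTML comments) comes from the answer above.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical page: the table is wrapped in an HTML comment
SAMPLE_HTML = """
<div id="all_team-stats">
<!--
<table id="team-stats">
  <tr><th>Team</th><th>PTS</th></tr>
  <tr><td>Milwaukee Bucks</td><td>9686</td></tr>
</table>
-->
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A normal search sees no tables, because the markup is inside a comment
visible_tables = soup.find_all("table")

comments = soup.find_all(string=lambda text: isinstance(text, Comment))

hidden_tables = []
for comment in comments:
    if "<table" in comment:
        # Re-parse the comment body as HTML so its tags become searchable
        inner = BeautifulSoup(str(comment), "html.parser")
        hidden_tables.extend(inner.find_all("table"))

ids = [t.get("id") for t in hidden_tables]
print(len(visible_tables), ids)
```

Once a hidden table is a regular Tag, you can select it by id or class rather than relying on its position in a list.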

Find specific table using BeautifulSoup with specific caption

I would try to find all captions and then to match the caption text like this:

from bs4 import BeautifulSoup
import requests


header = {'User-agent' : 'Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5'}

redirect = requests.get('http://goblueraiders.com/boxscore.aspx?path=baseball&id=6117', headers = header).text
soup = BeautifulSoup(redirect, 'html.parser')

for caption in soup.find_all('caption'):
    if caption.get_text() == 'Tennessee Tech - Pitching Stats':
        table = caption.find_parent('table', {'class': 'sidearm-table collapse-on-medium accordion'})
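
The caption match can be wrapped in a small helper that stops at the first hit and returns None when nothing matches. This is only a sketch: table_by_caption is a hypothetical name, the sample HTML is made up (the "Batting Stats" caption is an invented decoy), and it strips whitespace before comparing, which the loop above does not do.

```python
from bs4 import BeautifulSoup

# Hypothetical sample with two captioned tables
SAMPLE_HTML = """
<table><caption>Tennessee Tech - Batting Stats</caption><tr><td>1</td></tr></table>
<table><caption> Tennessee Tech - Pitching Stats </caption><tr><td>2</td></tr></table>
"""

def table_by_caption(soup, wanted):
    """Return the first table whose caption text equals `wanted`, else None."""
    for caption in soup.find_all("caption"):
        # strip=True ignores whitespace padding around the caption text
        if caption.get_text(strip=True) == wanted:
            return caption.find_parent("table")
    return None

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = table_by_caption(soup, "Tennessee Tech - Pitching Stats")
print(table.td.get_text())
```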

BeautifulSoup - find table with specified class on Wikipedia page

You shouldn't select on the jquery-tablesorter class in the response you get from requests, because that class is applied dynamically after the page loads. If you omit it, you should be good to go.

tab = soup.find("table",{"class":"wikitable sortable"})
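
One detail worth knowing: passing a multi-class string such as "wikitable sortable" to find matches the class attribute as an exact string, so the order of the classes matters. A CSS selector matches on the individual classes instead. The snippet below is a sketch on made-up HTML that shows the difference.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: same two classes, written in different orders
SAMPLE_HTML = """
<table class="sortable wikitable"><tr><td>first</td></tr></table>
<table class="wikitable sortable"><tr><td>second</td></tr></table>
"""
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Exact match on the class attribute string: only the second table qualifies
exact = soup.find("table", {"class": "wikitable sortable"})

# CSS selector: both classes required, any order, extra classes allowed
by_selector = soup.select("table.wikitable.sortable")

print(exact.td.get_text(), len(by_selector))
```

The selector form is the more forgiving choice when a page may list its classes in a different order or add further ones.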

python BeautifulSoup parsing table

Here you go:

data = []
table = soup.find('table', attrs={'class':'lineItemsTable'})
table_body = table.find('tbody')

rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele]) # Get rid of empty values

This gives you:

[ [u'1359711259', u'SRF', u'08/05/2013', u'5310 4 AVE', u'K', u'19', u'125.00', u'$'], 
  [u'7086775850', u'PAS', u'12/14/2013', u'3908 6th Ave', u'K', u'40', u'125.00', u'$'], 
  [u'7355010165', u'OMT', u'12/14/2013', u'3908 6th Ave', u'K', u'40', u'145.00', u'$'], 
  [u'4002488755', u'OMT', u'02/12/2014', u'NB 1ST AVE @ E 23RD ST', u'5', u'115.00', u'$'], 
  [u'7913806837', u'OMT', u'03/03/2014', u'5015 4th Ave', u'K', u'46', u'115.00', u'$'], 
  [u'5080015366', u'OMT', u'03/10/2014', u'EB 65TH ST @ 16TH AV E', u'7', u'50.00', u'$'], 
  [u'7208770670', u'OMT', u'04/08/2014', u'333 15th St', u'K', u'70', u'65.00', u'$'], 
  [u'$0.00\n\n\nPayment Amount:']
]

A couple of things to note:

- The last row in the output above, the Payment Amount, is not part of the table, but that is how the table is laid out. You can filter it out by checking whether the length of the list is less than 7.
- The last column of every row will have to be handled separately since it is an input text box.
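
The length check from the first note could be sketched like this. The rows are copied from the output above (written as Python 3 strings); MIN_FIELDS is just an illustrative name for the threshold of 7.

```python
# Two real rows from the output above, plus the stray "Payment Amount" row
data = [
    ['1359711259', 'SRF', '08/05/2013', '5310 4 AVE', 'K', '19', '125.00', '$'],
    ['4002488755', 'OMT', '02/12/2014', 'NB 1ST AVE @ E 23RD ST', '5', '115.00', '$'],
    ['$0.00\n\n\nPayment Amount:'],
]

MIN_FIELDS = 7  # rows shorter than this are layout leftovers, not line items

rows = [row for row in data if len(row) >= MIN_FIELDS]
print(len(rows))
```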

BeautifulSoup can't find table

You need Selenium to extract the table data because the data is loaded through JavaScript. As an example, here I extract the data from the first table and save it to a CSV file.

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
import time
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

url = 'https://www.nba.com/standings?GroupBy=conf&Season=2019-20&Section=overall'
driver = webdriver.Chrome(r"C:\Users\Subrata\Downloads\chromedriver.exe")
driver.get(url)

soup = BeautifulSoup(driver.page_source, 'html.parser')
tables = soup.select('div.StandingsGridRender_standingsContainer__2EwPy')
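table1 = []
# Each <tr> becomes one list of cell texts; the first row holds the column headers
for row in tables[0].find_all('tr'):
    cells = [cell.getText(strip=True, separator=' ') for cell in row]
    table1.append(cells)

df = pd.DataFrame(table1[1:], columns=table1[0])
df.to_csv('x.csv')
driver.quit()  # close the browser once the data is saved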

Beautiful Soup can't find tables

Try to disable javascript when you visit https://covid.knoxcountytn.gov/case-count.html and you will see no table. As @barny said the table is generated with javascript so you can't parse it with BeautifulSoup (at least not easily, see How to call JavaScript function using BeautifulSoup and Python).
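
You can run the same check from Python by parsing the raw response and counting the table tags: if the count is zero but the browser shows a table, the table is being built by JavaScript. A minimal sketch follows, with count_static_tables as a hypothetical helper and two made-up HTML strings standing in for real responses.

```python
from bs4 import BeautifulSoup

def count_static_tables(html):
    """Count <table> tags present in the HTML as delivered, before any JavaScript runs."""
    return len(BeautifulSoup(html, "html.parser").find_all("table"))

# Hypothetical responses: one page built by a script, one with the table in the markup
JS_RENDERED = '<div id="app"></div><script src="app.js"></script>'
STATIC = '<table><tr><td>1</td></tr></table>'

print(count_static_tables(JS_RENDERED), count_static_tables(STATIC))
```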

how to select a particular table and print its data using beautifulsoup

Expanding on @furas' comment slightly, as report_tables[4] assumes it will always be the 5th table:

import requests
from bs4 import BeautifulSoup

req = requests.get("https://www.ssllabs.com/ssltest/analyze.html?d=drtest.test.sentinelcloud.com")
data = req.text
soup = BeautifulSoup(data, 'html.parser')

for found_table in soup.find_all('table', class_='reportTable'):
    # Identify the table by its content rather than by its position
    if 'Cipher Suites' in found_table.get_text():
        values = found_table.find_all('td', class_='tableLeft')
        entries = []
        for row in values:
            entries.append(row.get_text())
        print(entries)

Checking for 'Cipher Suites' (though you could use a more complete title if need be) should help you get the correct table more consistently.

You could simply use values as the output, but using get_text() helps remove some of the HTML that you likely won't need. entries will contain the values you require, but you might need to look into functions like strip to clear whitespace from the results.

PRODUCED RESULT:

[u'\n                                            TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256\n                                        (0xc02f)\n                                                            \xa0  ECDH secp256r1 (eq. 3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384\n                                        (0xc030)\n                                                            \xa0  ECDH secp256r1 (eq. 3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_DHE_RSA_WITH_AES_128_GCM_SHA256\n                                        (0x9e)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_DHE_RSA_WITH_AES_256_GCM_SHA384\n                                        (0x9f)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256\n                                        (0xc027)\n                                                            \xa0  ECDH secp256r1 (eq. 3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA\n                                        (0xc013)\n                                                            \xa0  ECDH secp256r1 (eq. 3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384\n                                        (0xc028)\n                                                            \xa0  ECDH secp256r1 (eq. 3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA\n                                        (0xc014)\n                                                            \xa0  ECDH secp256r1 (eq. 
3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_DHE_RSA_WITH_AES_128_CBC_SHA256\n                                        (0x67)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_DHE_RSA_WITH_AES_128_CBC_SHA\n                                        (0x33)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_DHE_RSA_WITH_AES_256_CBC_SHA256\n                                        (0x6b)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_DHE_RSA_WITH_AES_256_CBC_SHA\n                                        (0x39)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA\n                                        (0xc012)\n                                                            \xa0  ECDH secp256r1 (eq. 
3072 bits RSA) \xa0 FS\n', u'\n                                            TLS_RSA_WITH_AES_128_GCM_SHA256\n                                        (0x9c)\n                                                                \n                    \n                ', u'\n                                            TLS_RSA_WITH_AES_256_GCM_SHA384\n                                        (0x9d)\n                                                                \n                    \n                ', u'\n                                            TLS_RSA_WITH_AES_128_CBC_SHA256\n                                        (0x3c)\n                                                                \n                    \n                ', u'\n                                            TLS_RSA_WITH_AES_256_CBC_SHA256\n                                        (0x3d)\n                                                                \n                    \n                ', u'\n                                            TLS_RSA_WITH_AES_128_CBC_SHA\n                                        (0x2f)\n                                                                \n                    \n                ', u'\n                                            TLS_RSA_WITH_AES_256_CBC_SHA\n                                        (0x35)\n                                                                \n                    \n                ', u'\n                                            TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA\n                                        (0x88)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_RSA_WITH_CAMELLIA_256_CBC_SHA\n                                        (0x84)\n                                                                \n                    \n                ', u'\n                                            TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA\n 
                                       (0x45)\n                                 \xa0\n                                    \nDH 2048 bits \xa0 FS\n', u'\n                                            TLS_RSA_WITH_CAMELLIA_128_CBC_SHA\n                                        (0x41)\n                                                                \n                    \n                ', u'\n                                            TLS_RSA_WITH_3DES_EDE_CBC_SHA\n                                        (0xa)\n                                                                \n                    \n                ']

EDIT: to expand this in line with @PadraicCunningham's comments, we can remove the whitespace and return the first value as follows:

for found_table in soup.find_all('table', class_='reportTable'):
    if 'Cipher Suites' in found_table.get_text():
        vals = [td.text.split()[0] for td in found_table.select("td.tableLeft")]
        print(vals)
        break
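
If you want the whole cell text rather than just the first token, you can collapse the whitespace instead. A small sketch, where squeeze is a hypothetical helper name and raw is a shortened copy of one entry from the result above:

```python
# One entry from the produced result, with the long runs of spaces shortened
raw = ('\n     TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256\n    (0xc02f)\n'
       '    \xa0  ECDH secp256r1 (eq. 3072 bits RSA) \xa0 FS\n')

def squeeze(text):
    """Collapse every run of whitespace (including non-breaking spaces) into one space."""
    return " ".join(text.split())

cleaned = squeeze(raw)
print(cleaned)
```

In Python 3, split() with no arguments also treats the non-breaking space (\xa0) as whitespace, so it disappears along with the newlines and padding.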