i hate nvidia
nvidia is a piece of shit
', 'html.parser')
print(soup.p)
#i hate nvidia
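
# Illustrative sketch, not part of the original example: the sample markup
# and the names demo_soup / first_p are assumptions made for this note.
# It shows that attribute-style access such as soup.p returns only the
# first matching tag, even when several <p> tags exist.
from bs4 import BeautifulSoup

demo_soup = BeautifulSoup('<p>first</p><p>second</p>', 'html.parser')
first_p = demo_soup.p
print(first_p)       # <p>first</p>
print(first_p.name)  # p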
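```

To get every tag of type `child-tag-type` in the soup object, call the `find_all` method. `find_all` searches the whole tree for nodes of the given type, covering indirect descendants as well as direct children:

```py
soup = BeautifulSoup(
    'i hate nvidia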
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
,nvidia is a piece of shit
,fuck Jensen Huang
]
```

### .contents

The `contents` attribute returns a tag's direct children:

```py
soup = BeautifulSoup(
    'i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
,nvidia is a piece of shit
,fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
#nvidia is a piece of shit
#fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
i hate nvidianvidia is a piece of shit
nvidia is a piece of shitfuck Jensen Huang
fuck Jensen Huang
fuck Jensen Huang
```

### .string

If a tag has exactly one child and that child is a `NavigableString`, you can read it directly through the `string` attribute:

```py
soup = BeautifulSoup(
    'i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
print(mid_node.next_sibling) #fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
] [fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
,nvidia is a piece of shit
,fuck Jensen Huang
]
```

#### Regular expressions

When searching by `name`, you can also pass a regular expression:

```py
soup = BeautifulSoup(
    'i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
,nvidia is a piece of shit
,fuck Jensen Huang
fuck Jensen Huang
]
```

#### True

To match every tag, pass `True` as `name`:

```py
soup = BeautifulSoup(
    'i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
fuck Jensen Huang
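
# Illustrative sketch, not part of the original example: the sample markup
# and the names demo_soup / all_tag_names are assumptions made for this note.
# Passing True as the name matches every tag in document order, but no
# text nodes.
from bs4 import BeautifulSoup

demo_soup = BeautifulSoup('<div><p>one</p><a>two</a></div>', 'html.parser')
all_tag_names = [tag.name for tag in demo_soup.find_all(True)]
print(all_tag_names)  # ['div', 'p', 'a']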
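```

#### Custom functions

Beyond the filters above, you can define your own function to filter tag objects:

```py
# Return True only for <p> tags whose id attribute equals '100'.
def is_match(tag):
    return tag.name == 'p' and 'id' in tag.attrs and tag.attrs['id'] == '100'

soup = BeautifulSoup(
    'i hate nvidia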
nvidia is a piece of shit
fuck Jensen Huang
nvidia is a piece of shit
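
# Illustrative sketch, not part of the original example: the sample markup
# and the helper has_id_100 are hypothetical. It applies the same kind of
# predicate as is_match, written with tag.get(), which returns None when
# the attribute is missing.
from bs4 import BeautifulSoup

def has_id_100(tag):
    return tag.name == 'p' and tag.get('id') == '100'

demo_soup = BeautifulSoup(
    '<p id="1">one</p><p id="100">two</p><a id="100">three</a>', 'html.parser')
matches = demo_soup.find_all(has_id_100)
print(matches)  # [<p id="100">two</p>]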
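```

### Searching by attribute

If you pass `find_all` a keyword argument that is not one of its built-in parameters, the argument name is treated as an attribute name and the search runs against that attribute:

```py
soup = BeautifulSoup(
    'i hate nvidia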
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
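
# Illustrative sketch, not part of the original example: the sample markup
# and the names demo_soup / by_id / with_href are assumptions made for this
# note. Keyword arguments that find_all does not recognise are matched
# against tag attributes.
from bs4 import BeautifulSoup

demo_soup = BeautifulSoup(
    '<p id="1">one</p><p id="2">two</p><a href="/x">link</a>', 'html.parser')
by_id = demo_soup.find_all(id='2')
# True matches any tag that has the attribute, whatever its value.
with_href = demo_soup.find_all(href=True)
print(by_id)      # [<p id="2">two</p>]
print(with_href)  # [<a href="/x">link</a>]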
```

You can also search by attribute by passing a dictionary to the `attrs` parameter:

```py
soup = BeautifulSoup(
    'i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
nvidia is a piece of shit
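
# Illustrative sketch, not part of the original example: the sample markup
# and the data-role attribute are hypothetical. An attrs dictionary is handy
# for attribute names such as data-* that cannot be written as Python
# keyword arguments.
from bs4 import BeautifulSoup

demo_soup = BeautifulSoup(
    '<p data-role="note">one</p><p>two</p>', 'html.parser')
notes = demo_soup.find_all(attrs={'data-role': 'note'})
print(notes)  # [<p data-role="note">one</p>]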
```

### Searching by class

Because `class` is a reserved word in Python, search by CSS class with the `class_` argument:

```py
soup = BeautifulSoup(
    'i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
nvidia is a piece of shit
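
# Illustrative sketch, not part of the original example: the sample markup
# and class names are assumptions made for this note. class_ matches a tag
# when any one of its classes equals the given value.
from bs4 import BeautifulSoup

demo_soup = BeautifulSoup(
    '<p class="intro lead">one</p><p class="body">two</p>', 'html.parser')
intro = demo_soup.find_all('p', class_='intro')
print(intro)  # [<p class="intro lead">one</p>]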
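```

### Searching by string

With the `string` argument, the search runs against the text content of the HTML document, and the results contain only `NavigableString` objects:

```py
soup = BeautifulSoup(
    'i hate nvidia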
nvidia is a piece of shit
fuck Jensen Huang
i hate nvidia
nvidia is a piece of shit
fuck Jensen Huang
...
]
```

Searching level by level through nested tags:

```py
soup.select("body a")
# [Elsie,
#  Lacie,
#  Tillie]

soup.select("html head title")
# [