// and ./ in XPath

Keywords: REST

XPath can locate elements with either // or ./. A path that starts with // matches elements anywhere in the entire page, while ./ selects relative to the current node. There is still one thing I don't understand, though.

Question:

When I used XPath to select all divs with class="quote", I got 10 results in total, as shown below. The trouble starts when I take one of them and extract further from it.
For example, to get the text of the first span inside the first div:

>>> quotes = response.xpath('//div[@class = "quote"]')
>>> quotes[0].xpath('//span[1]/text()').extract()
The result contains the text of every matching tag on the page, not just the text under this div as I expected (the full output is shown further down).
When I use ./ (the current node) instead, I get exactly one result:
>>> quotes[0].xpath('./span[1]/text()').extract()
['"The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking."']

My question: I know that // selects every matching tag, but I called it on a single div (quotes[0]). Printing quotes[0] really does show only one div and nothing else, so why does

>>> quotes[0].xpath('//span[1]/text()').extract()

written with //, still extract every matching result from the entire page?

By contrast, extracting with a CSS selector does not have this problem.

Details are as follows:

Target overall structure:

This is the first div; every subsequent div has the same structure.
    <div class="quote" itemscope itemtype="http://schema.org/CreativeWork">
        <span class="text" itemprop="text">"The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking."</span>
        <span>by <small class="author" itemprop="author">Albert Einstein</small>
        <a href="/author/Albert-Einstein">(about)</a>
        </span>
        <div class="tags">
            Tags:
            <meta class="keywords" itemprop="keywords" content="change,deep-thoughts,thinking,world" />
            <a class="tag" href="/tag/change/page/1/">change</a>            
            <a class="tag" href="/tag/deep-thoughts/page/1/">deep-thoughts</a>           
            <a class="tag" href="/tag/thinking/page/1/">thinking</a>            
            <a class="tag" href="/tag/world/page/1/">world</a>
        </div>
    </div>

Select each div with XPath:
>>> quotes = response.xpath('//div[@class = "quote"]')
>>> quotes
[<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>, 
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>, 
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>,
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>, 
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>, 
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>, 
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>,
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>,
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>, 
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>]
>>>
>>> len(quotes)
10
>>> quotes[0]		# Only one result.
<Selector xpath='//div[@class = "quote"]' data='<div class="quote" itemscope itemtype...'>
>>> print(quotes[0].extract())		# Only this one div is printed.
<div class="quote" itemscope itemtype="http://schema.org/CreativeWork">
        <span class="text" itemprop="text">"The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking."</span>
        <span>by <small class="author" itemprop="author">Albert Einstein</small>
        <a href="/author/Albert-Einstein">(about)</a>
        </span>
        <div class="tags">
            Tags:
            <meta class="keywords" itemprop="keywords" content="change,deep-thoughts,thinking,world">
            <a class="tag" href="/tag/change/page/1/">change</a>
            <a class="tag" href="/tag/deep-thoughts/page/1/">deep-thoughts</a>
            <a class="tag" href="/tag/thinking/page/1/">thinking</a>
            <a class="tag" href="/tag/world/page/1/">world</a>
        </div>
    </div>

# But why does extracting the text of span[1] from quotes[0] return results from the whole page?
# If any experienced developers happen to see this, please help me understand.
>>> quotes[0].xpath('//span[1]/text()').extract()
['"The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking."', 
'"It is our choices, Harry, that show what we truly are, far more than our abilities."', 
'"There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle."', 
'"The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid."',
 '"Imperfection is beauty, madness is genius and it\'s better to be absolutely ridiculous than absolutely boring."',
 '"Try not to become a man of success. Rather become a man of value."', '"It is better to be hated for what you are than to be loved for what you are not."',
 '"I have not failed. I\'ve just found 10,000 ways that won\'t work."', '"A woman is like a tea bag; you never know how strong it is until it\'s in hot water."', 
 '"A day without sunshine is like, you know, night."', '→', '\n            ', '\n            ', '❤']


# Using ./ (relative to the current node) extracts only the text under this node:
>>> quotes[0].xpath('./span[1]/text()').extract()
['"The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking."']
>>> quotes[1].xpath('./span[1]/text()').extract()
['"It is our choices, Harry, that show what we truly are, far more than our abilities."']
>>>

Posted by bschmitt78 on Sun, 06 Oct 2019 13:02:17 -0700