
Trying to Scrape an Array

makamo66

New Coder
May 20, 2020
20
0
1
I'm trying to scrape the following:


HTML:
<tr>
        <td style="font-family:eurof;font-size:14px;padding-top:0px;padding-bottom:5px;"><a style="font-family:eurof;font-size:14px;" href="jewelry">JEWELRY</a> &nbsp;&gt;&nbsp; <a style="font-family:eurof;font-size:14px;" href="jewelry/anklet">ANKLET</a> &nbsp;&gt;&nbsp; <a style="font-family:eurof;font-size:14px;" href="jewelry/anklet/fashion">FASHION</a> &nbsp;&gt;&nbsp; <a style="font-family:eurof;font-size:14px;" href="jewelry/anklet/fashion/"></a>
        </td>
      </tr>
I use:

Code:
var categories = [];
const cheerio = require("cheerio");
const $ = await cheerio.load(content);
categories.push($('a[style="font-family:eurof;font-size:14px;"]').text());

console.log(categories);
The results of my console.log are:

[ 'JEWELRYANKLETFASHION' ]

I want to get

[ 'JEWELRY','ANKLET','FASHION' ]
 

Master Yoda

Administrator
Verified
Jan 2, 2018
1,720
431
95
Canada
codeforum.org
Hello,

Please use BBCode for your code; the notice at the top of every forum is there as a helpful reminder.

Also, when you find a solution, please post the solution itself followed by its source. Some coders may not feel comfortable leaving the site, and having it here makes it quick and easy for everyone.
 

makamo66

New Coder
May 20, 2020
20
0
1
Code:
$('a[style="font-family:eurof;font-size:14px;"]').each(function() {
    categories.push($(this).text());
});
console.log(categories);
 

Ghost

Active Coder
Apr 19, 2019
407
184
43
That's strange that you are getting a return like that.
I replicated it in Vanilla JS and had no issues:
JavaScript:
<script>
window.onload = function(){
    var categories = [];
    // Browser-only version: the Cheerio require()/load() lines are left out,
    // because require() is not available in a plain <script> and await needs an async function.
    document.querySelectorAll("a[style='font-family:eurof;font-size:14px;']").forEach(function(link){
        // Skip links that have no visible text (such as the empty last breadcrumb link)
        if(typeof link.innerText.toLowerCase() == "string" && link.innerText.trim().length > 0){
            categories.push(link.innerText);
        }
    });
    console.log(categories);
}
</script>
To test it working without Cheerio code:
HTML:
        <td style="font-family:eurof;font-size:14px;padding-top:0px;padding-bottom:5px;"><a style="font-family:eurof;font-size:14px;" href="jewelry">JEWELRY</a> &nbsp;&gt;&nbsp; <a style="font-family:eurof;font-size:14px;" href="jewelry/anklet">ANKLET</a> &nbsp;&gt;&nbsp; <a style="font-family:eurof;font-size:14px;" href="jewelry/anklet/fashion">FASHION</a> &nbsp;&gt;&nbsp; <a style="font-family:eurof;font-size:14px;" href="jewelry/anklet/fashion/"></a>
JavaScript:
<script>
window.onload = function(){
    var categories = [];
    document.querySelectorAll("a[style='font-family:eurof;font-size:14px;']").forEach(function(link){
        if(typeof link.innerText.toLowerCase() == "string" && link.innerText.trim().length > 0){
            categories.push(link.innerText);
        }
    });
    console.log(categories);
}
</script>
 

makamo66

New Coder
May 20, 2020
20
0
1
Thanks, Ghost.
When I use your code, I get the error:
ReferenceError: document is not defined

I tried removing the document keyword and got

ReferenceError: querySelectorAll is not defined
 

Ghost

Active Coder
Apr 19, 2019
407
184
43
That's most likely because your code runs in Node.js, where there is no browser document object.
Try this instead. It uses Cheerio's jQuery-style API and loops through each result it finds.
Your code clumps them together because .text() on the whole selection returns the combined text of every matched element as one string, so you push a single value instead of going through each found element.
JavaScript:
var categories = [];
const cheerio = require("cheerio");
// cheerio.load() is synchronous, so no await is needed here
const $ = cheerio.load(content);
$('a[style="font-family:eurof;font-size:14px;"]').each(function(){
    // .text() always returns a string, so only the empty-text check is needed
    var text = $(this).text().trim();
    if(text.length > 0){
        categories.push(text);
    }
});
console.log(categories);
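The difference between the two approaches can be shown without Cheerio at all. This is only a dependency-free sketch: the `linkTexts` array is a hypothetical stand-in for the text of each matched link, and `join("")` models what calling `.text()` on the whole selection does.

```javascript
// Plain strings standing in for the text of each matched <a> element
// (the last breadcrumb link in the sample HTML is empty).
const linkTexts = ["JEWELRY", "ANKLET", "FASHION", ""];

// Calling .text() on the whole selection behaves like joining everything
// into one string, so the array ends up with a single entry.
const combined = [linkTexts.join("")];

// Visiting each element separately gives one entry per non-empty link.
const separate = [];
linkTexts.forEach(function (text) {
    if (text.trim().length > 0) {
        separate.push(text);
    }
});

console.log(combined); // [ 'JEWELRYANKLETFASHION' ]
console.log(separate); // [ 'JEWELRY', 'ANKLET', 'FASHION' ]
```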
 

makamo66

New Coder
May 20, 2020
20
0
1
Thank you, Ghost. That worked, but now I need to get the elements of the array separately to put in the return statement. I tried:
return {categories[0],categories[1], categories[2]}
but I got an "Unexpected token" error.
Do you know how to do this?
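The "Unexpected token" error comes from the object literal itself: the shorthand form `{ name }` only works with plain identifiers, so an expression like `categories[0]` needs an explicit key. A minimal sketch of two ways around it follows; the function names `asObject` and `asArray` and the key names are hypothetical, chosen only for illustration.

```javascript
// Sample data standing in for the scraped result.
const categories = ["JEWELRY", "ANKLET", "FASHION"];

// Invalid: return { categories[0], categories[1], categories[2] }
// Shorthand properties need a bare identifier, so give each value a key.
function asObject(list) {
    return {
        category1: list[0],
        category2: list[1],
        category3: list[2]
    };
}

// Alternative: return an array if the caller can accept one.
function asArray(list) {
    return [list[0], list[1], list[2]];
}

console.log(asObject(categories));
console.log(asArray(categories));
```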
 

makamo66

New Coder
May 20, 2020
20
0
1
I got around the error

ReferenceError: singleObject is not defined

by putting the return statement right after the block:

return {singleObject};

The problem now is that the entire object ends up in one cell of my spreadsheet, like this:

{"JEWELRY":"some value","ANKLET":"some value","FASHION":"some value","":"some value"}

I want each property of the object to have its own cell.
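One possible cause is that `return {singleObject}` is shorthand for `{ singleObject: singleObject }`, which nests the whole object under a single key. The sketch below copies the properties to the top level instead. It assumes the spreadsheet export turns each top-level key into its own column, which the thread does not confirm; `flatten` is a hypothetical helper and the object is a stand-in for the real one.

```javascript
// Hypothetical stand-in for the object built in the post above.
const singleObject = {
    JEWELRY: "some value",
    ANKLET: "some value",
    FASHION: "some value",
    "": "some value"
};

// { singleObject } is shorthand for { singleObject: singleObject },
// so the result has exactly one key holding the whole object.
const wrapped = { singleObject };

// Copy each property to the top level, dropping the empty key
// produced by the empty breadcrumb link.
function flatten(obj) {
    const out = {};
    for (const [key, value] of Object.entries(obj)) {
        if (key !== "") {
            out[key] = value;
        }
    }
    return out;
}

const flat = flatten(singleObject);
console.log(Object.keys(wrapped)); // [ 'singleObject' ]
console.log(Object.keys(flat));    // [ 'JEWELRY', 'ANKLET', 'FASHION' ]
```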
 

makamo66

New Coder
May 20, 2020
20
0
1
I finally got it to work with a method that I had tried before I moved the return statement. It didn't work previously but it does now.

let category1 = categories[0];
let category2 = categories[1];
let category3 = categories[2];

return {category1, category2, category3};
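The same result can be written more compactly with array destructuring, and generalized for breadcrumbs of any depth. This is a sketch only: `toNumberedKeys` and its `prefix` parameter are hypothetical names, and `Object.fromEntries` requires Node.js 12 or later.

```javascript
// Sample data standing in for the scraped result.
const categories = ["JEWELRY", "ANKLET", "FASHION"];

// Destructuring pulls the first three entries into named variables,
// which can then be used as shorthand properties.
const [category1, category2, category3] = categories;
const fixed = { category1, category2, category3 };

// For a variable number of categories, build the keys from the index.
function toNumberedKeys(list, prefix) {
    return Object.fromEntries(
        list.map(function (value, index) {
            return [prefix + (index + 1), value];
        })
    );
}

console.log(fixed);
console.log(toNumberedKeys(categories, "category"));
```

Both calls produce an object with the keys category1, category2 and category3 for this sample input.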