
NSScanner skipping?

simpleiPhonesdk Posts: 63 Registered Users
Hi Everyone,

I have an NSScanner set up to scan through an NSString that contains HTML. I want to sift through the HTML and get the main text, author, and date of each post from a website. The code correctly grabs all the titles, but when it goes for the authors, it skips the very first one, so every author ends up attached to the wrong post.

Here is my code (and I have double checked with the HTML online source to verify the tags are correct). Does anyone see something wrong with the code?


NSScanner *theScanner;
NSString *text = nil;
NSString *author = nil;
NSString *date = nil;

theScanner = [NSScanner scannerWithString:html];

while ([theScanner isAtEnd] == NO) {

    // find start of tag
    [theScanner scanUpToString:@"<pre class=\"content\">" intoString:NULL];
    // find end of tag
    [theScanner scanUpToString:@"</pre>" intoString:&text];
    text = [text stringByReplacingOccurrencesOfString:@"<pre class=\"content\">" withString:@""];
    [titlesArray addObject:text];

    // find start of tag
    [theScanner scanUpToString:@"<div class=\"date\">" intoString:NULL];
    // find end of tag
    [theScanner scanUpToString:@"</div>" intoString:&date];
    date = [date stringByReplacingOccurrencesOfString:@"<div class=\"date\">" withString:@""];
    [datesArray addObject:date];

    // find start of tag
    [theScanner scanUpToString:@"<div class=\"date\">Posted By: " intoString:NULL];
    // find end of tag
    [theScanner scanUpToString:@"</div>" intoString:&author];
    author = [author stringByReplacingOccurrencesOfString:@"<div class=\"date\">Posted By: " withString:@""];
    [authorsArray addObject:author];

}


Output (What it should be):
Title 1: This is post 1
Author: Joe

Title 2: This is post 2
Author: Bob

Output (what it is now, incorrect):
Title 1: This is post 1
Author: Bob

Title 2: This is post 2
Author: Karly

Thanks everyone :)

Replies

  • Fstuff Posts: 154 Registered Users
    I'm guessing that the html string is not ordered in the way that you expect. But I can't say for sure without seeing the string.
  • simpleiPhonesdk Posts: 63 Registered Users
    Fstuff;439933 said:
    I'm guessing that the html string is not ordered in the way that you expect. But I can't say for sure without seeing the string.
    Here is one part of the HTML string


    <html>
    <head>
    <title>Forum</title>
    <link rel="stylesheet" type="text/css" href="/static/forum.css" />


    </head>
    <body>



    <div class="login-link">
    dummy_1
    <a href="/blog/logout">(logout)</a>
    </div>


    <a href="/blog"><div class="title">Forum</div><br></a>

    <center><form method="post" class="newpostform">
    <div>New Post</div>
    <table>
    <tr>
    <td class="label">Subject</td>
    <td><input name= "subject" class="newpostinput"></td>
    </tr>

    <tr>
    <td class="label">Content</td>
    <td><textarea class="newposttextarea" name="content"></textarea></td>
    </tr>
    <tr><td></td><td><input type = "submit" class="button"></td></tr>
    <tr>
    <td class="error"></td>
    </tr>
    </table>
    </form></center>


    <a href = /blog/32001><div class="subject">This is a test</div></a>
    <div class="date">Jun 29, 2012</div>
    <div class="date">Posted By: dummy_1</div><br>

    <pre class="content">Test</pre>
  • Fstuff Posts: 154 Registered Users
    The first line that you scan for appears to be the last line of the content that you posted. So, yeah, you've got things out of order.
  • simpleiPhonesdk Posts: 63 Registered Users
    Fstuff;439986 said:
    The first line that you scan for appears to be the last line of the content that you posted. So, yeah, you've got things out of order.
    Oh.. duh.. nice catch Fstuff.

    Thanks.
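
For anyone landing here later, the reordering Fstuff points to might look roughly like the sketch below. In the sample HTML each post lists its date, then its author, then its <pre class="content"> body. The original loop looked for the body first, so the scanner ran past the first post's date and author before it started searching for them, and every later author was paired with the previous post. This is only a sketch under assumptions: ScanBetween and ParsePosts are hypothetical names that do not appear in the thread, and the page is assumed to follow the date, author, content order shown in the sample for every post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): skips ahead to `open`, consumes it,
// and returns the text up to `close`. Returns nil if either marker is missing.
static NSString *ScanBetween(NSScanner *scanner, NSString *open, NSString *close) {
    NSString *value = nil;
    // Move to just before the opening marker (or to the end if it is absent).
    [scanner scanUpToString:open intoString:NULL];
    // Consume the marker itself so it does not end up in the captured text.
    if (![scanner scanString:open intoString:NULL]) {
        return nil;
    }
    // Capture everything up to the closing marker; value stays nil if empty.
    [scanner scanUpToString:close intoString:&value];
    if (![scanner scanString:close intoString:NULL]) {
        return nil;
    }
    return value ?: @"";
}

// Hypothetical entry point: scans in the same order the markup appears
// (date, author, content) and returns one dictionary per post.
static NSArray *ParsePosts(NSString *html) {
    NSMutableArray *posts = [NSMutableArray array];
    NSScanner *scanner = [NSScanner scannerWithString:html];

    while (![scanner isAtEnd]) {
        NSString *date = ScanBetween(scanner, @"<div class=\"date\">", @"</div>");
        NSString *author = ScanBetween(scanner, @"<div class=\"date\">Posted By: ", @"</div>");
        NSString *text = ScanBetween(scanner, @"<pre class=\"content\">", @"</pre>");

        // Stop at the first incomplete post, e.g. trailing markup after the last one.
        if (date == nil || author == nil || text == nil) {
            break;
        }
        [posts addObject:@{ @"date": date, @"author": author, @"text": text }];
    }
    return posts;
}
```

Called on the sample fragment above, ParsePosts should return a single entry with date "Jun 29, 2012", author "dummy_1", and text "Test". Consuming each marker with scanString:intoString: also means the stringByReplacingOccurrencesOfString: cleanup from the original code is no longer needed. Note that NSScanner skips leading whitespace by default, so captured values lose any leading spaces or newlines.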