Reading a File Where Some Have 2 Words
Howdy,
I am struggling with finding the best way to read individual words in a text file.
I can use getline to extract whole lines, but I need to pull out individual words and store them in a string array for comparison to another word. Overall it's just a word search program.
Is there a simple way of reading words in C++ or do I have to have it look at every character and detect spaces or something?
Thanks for the help!
The formatted extraction operator, >>, already does this. It reads until the next whitespace and skips any leading or trailing whitespace.
The operator >> skips only leading whitespace. It stops extracting when it reaches the end of a word.
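To make the leading-versus-trailing point concrete, here is a minimal sketch. It is not code from this thread: the name first_word_and_rest is made up for illustration, and a std::istringstream stands in for the file so no input file is needed.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Extracts one word with >> and returns it together with whatever is
// still left in the stream afterwards.
// (first_word_and_rest is a hypothetical helper, not from the thread.)
std::pair<std::string, std::string> first_word_and_rest(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    in >> word;                    // skips leading whitespace, stops at the next whitespace

    std::string rest;
    std::getline(in, rest, '\0');  // collect the untouched remainder of the stream
    return {word, rest};
}
```

Calling first_word_and_rest("   Hey You\n") should give "Hey" as the word and " You\n" as the remainder: the leading spaces were skipped, but the space after Hey is still waiting in the stream.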
I just ran into this issue myself, only I wish to have it ignore the whitespaces.
When you say 'ignore the whitespaces', this could mean one of many things - could you be more specific? To me, >> already ignores whitespace, but I assume this is not what you mean.
I mean I was attempting to grab the whole line, and not individual words. I didn't really intend to hijack the thread, though. It appears the OP wants to catch individual words, rather than whole lines. There must be something fundamentally different in how we've coded, resulting in us each applying the code in the wrong way. This is the code I have, which could very well be the solution he is looking for.
[code listing not preserved]
Please enter a string to be written to the file: Hey You
Writing to file...
Reading from file...
The file contents are: Hey
When I check the contents of information.txt manually, I observe that only Hey is present, so I could be wrong that the issue was my getline implementation. Rather, it appears to be tied to cin. Changing line 15 to the following makes it work properly.
getline(cin,w_content);
Please enter a string to be written to the file: Hey You
Writing to file...
Reading from file...
The file contents are: Hey You
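The difference between the two ways of reading the user's input can be sketched as follows. This is an illustration rather than the original program: first_token and whole_line are hypothetical names, and a std::istringstream stands in for cin so the typed text can be passed in as a string.

```cpp
#include <sstream>
#include <string>

// Reads the "typed" text with >>, the way cin >> w_content would.
// Extraction stops at the first whitespace, so only one word survives.
std::string first_token(const std::string& typed) {
    std::istringstream in(typed);   // stands in for cin
    std::string w_content;
    in >> w_content;
    return w_content;
}

// Reads the same text with getline, the way getline(cin, w_content) would.
// Everything up to (not including) the newline is kept.
std::string whole_line(const std::string& typed) {
    std::istringstream in(typed);   // stands in for cin
    std::string w_content;
    std::getline(in, w_content);
    return w_content;
}
```

With the input "Hey You\n", first_token should return "Hey" and whole_line should return "Hey You", which matches the two console sessions shown above.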
So in the end, I solved my own question. If Darth shows us some code, I'm sure we can nudge him in the right direction. I know that I've used getline to treat commas as delimiters, and put it in a loop to assign each word to its own array position. Depending on the structure of his input file, this may or may not be a practical solution, or you may wish to have it treat spaces the same way. Try using getline like this within a loop:
getline(inputfilehandle,s[i],' ');
where s is an array of strings and i is the loop counter. If you are confused about the inputfilehandle part, take a look at how I used infile and outfile above.
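A minimal sketch of that loop, under a few assumptions: the array holds 16 strings (kMaxWords is a made-up capacity), split_on_spaces is a hypothetical helper name, and a std::istringstream takes the place of inputfilehandle so the example runs without a file.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

constexpr std::size_t kMaxWords = 16;   // hypothetical capacity, chosen for illustration

// Splits `text` on single space characters into the array `s`.
// Returns how many array positions were filled.
std::size_t split_on_spaces(const std::string& text,
                            std::array<std::string, kMaxWords>& s) {
    std::istringstream inputfilehandle(text);   // a std::ifstream would work the same way
    std::size_t i = 0;
    // Stop when the array is full or when getline can no longer read anything.
    while (i < s.size() && std::getline(inputfilehandle, s[i], ' ')) {
        ++i;
    }
    return i;
}
```

Note that this splits only on the space character, so a newline in the input ends up inside a string instead of separating two words.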
So it appears I need to use >>. I will try that. Right now the code only uses getline to grab entire lines of text from a text file. So what I need is to figure out how to properly implement the >> operator in my code. Let me try it and if I can't figure it out I'll post my code.
Thanks for the help guys!
Some further experimentation reveals that adding ' ' as the third parameter makes it treat spaces as delimiters.
Here's where I'm at...
[code listing not preserved]
This is killing me, it's supposed to be part of a program I made that can search within docx files for keywords, and everything works except I can't read individual words in a text file. I feel stupid!
How can I make this read an individual word? Right now the console appears and just sits there, nothing displayed, doesn't seem to use a lot of memory or anything so I don't think it's an infinite loop or anything.
Ideas? I tried using .eof() and .good() in the loop, no luck.
Try changing line 22 to getline(Test,testline,' '); and see what happens.
EDIT: Actually it required another change. Try the change mentioned above, but also remove line 23. Also, on line 15, you can just use !Test as the condition for your if statement.
Check out this code, adapted from yours.
[code listing not preserved]
My input file looks like this:
Steven Seagal
1234 Post Drive
Ventura, CA 90734
Adam Sandler
356 Golf Street
Calabasas, CA 92136
and my output looked like this:
Steven Seagal 1234 Post Drive Ventura, CA 90734 Adam Sandler 356 Golf Street Calabasas, CA 92136
Array contents:
Steven Seagal 1234 Post Drive Ventura, CA 90734 Adam Sandler 356 Golf Street Calabasas, CA 92136
It's not perfect though: when outputting word[1] it outputted [output not preserved], so more is needed to help it differentiate between words and numerical values, or it has to do with retention of invisible newline characters. I'm going to be looking into this further.
why non just use:
| |
? it grabs whitespace seperated words
If you use getline with ' ' as the delimiter, it will not skip multiple spaces and it will keep other forms of whitespace that >> would normally skip.
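The two behaviours can be compared side by side with a small sketch. The function names are hypothetical and the input is fed from a string rather than a file, so treat this as an illustration of the point rather than code from the thread.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits using getline with ' ' as the delimiter.
// Consecutive spaces yield empty strings, and newlines stay inside the pieces.
std::vector<std::string> split_with_getline(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> parts;
    std::string part;
    while (std::getline(in, part, ' ')) {
        parts.push_back(part);
    }
    return parts;
}

// Splits using >>, which treats any run of spaces, tabs or newlines
// as a single separator.
std::vector<std::string> split_with_extraction(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> parts;
    std::string part;
    while (in >> part) {
        parts.push_back(part);
    }
    return parts;
}
```

For the input "Ventura,  CA\n90734" (two spaces after the comma), the getline version should produce "Ventura,", an empty string, and "CA\n90734" as one entry, while the >> version should produce "Ventura,", "CA" and "90734".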
This would explain a lot about my program's behavior. Upon modification, I realized that some values, despite being outputted, were not properly being stored. I changed the array output loop at the end to
[code listing not preserved]
hoping to display each array position's string. Not only did some entries output to multiple lines, but some values, such as the first zip code, were never stored. If you output [expression not preserved] it is the CA from the first address, yet when outputting [expression not preserved] it just displays Adam.
I tried making changes as suggested above, but it only made things worse.
[code listing not preserved]
1234 Ventura, Adam 356 Calabasas,92136
Array contents:
(0) 1234(1) (2) Ventura,(3) (4) Adam(5) 356(6) (7) Calabasas,(8) (9) 92136(10)
Changing the condition of the while loop to just be Test on line 25 results in closer to the desired output, yet still displays some oddities.
Steven Seagal 1234 Post Drive Ventura, CA 90734 Adam Sandler 356 Golf Street Calabasas, CA 92136 92136
Array contents:
Steven(0) Seagal 1234(1) Post(2) Drive Ventura,(3) CA(4) 90734 Adam(5) Sandler 356(6) Golf(7) Street Calabasas,(8) CA(9) 92136(10) 92136(11)
I have two questions:
#1. Why does it output and store the last zip code twice when reading it in?
#2. Why are multiple strings being combined into single entries? (I assume it has something to do with newline characters, but am unsure). Outputting [expression not preserved] gives me [output not preserved] now.
However I wish to store the information as 16 different entries, not 12, and without the redundancy of storing zip codes twice.
I got it mostly solved now. LB was right all along about the use of >>, it really is key to reading in individual words.
[code listing not preserved]
Array contents: Steven(0) Seagal(1) 1234(2) Post(3) Drive(4) Ventura,(5) CA(6) 90734(7) Adam(8) Sandler(9) 356(10) Golf(11) Street(12) Calabasas,(13) CA(14) 92136(15) 92136(16)
Why is it storing the final value twice? Other than that, this code should demonstrate how to collect individual words from file input.
EDIT: Problem solved. I changed the condition of the while loop on line 24 to !Test.eof() and this prevented the last entry being stored twice. Using the above code with that change should let you read any entries within a text file to an array or vector of strings. If choosing to go with an array, use word[i]=testline; with i as a counter, instead of push_back inside of the while loop. Just so you know Darth, the console window was awaiting input because of line 23 in the code you posted last.
CplusplusAcolyte wrote: "Why is it storing the last value twice?"
Because your loop condition is incorrect:
[code listing not preserved]
When you read the last bytes of a file, it does not set any eof or bad flags, so the stream is still in a good state, meaning you run the loop an extra time. On this extra time, the input operation fails, leaving 'testline' unchanged, and then since you don't check that it failed, you push it onto the vector again. Always perform input in the loop condition:
[code listing not preserved]
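Here is a minimal sketch of both loop shapes. It assumes the faulty condition was a bare stream check such as while(Test), which is what an earlier post describes; the function names are made up, and a std::istringstream named Test stands in for the file stream.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Checks the stream state BEFORE reading. The final read fails,
// testline keeps its previous value, and that value is pushed again.
std::vector<std::string> read_checking_first(const std::string& text) {
    std::istringstream Test(text);   // stands in for the ifstream
    std::vector<std::string> words;
    std::string testline;
    while (Test) {
        Test >> testline;            // the result of this read is never checked
        words.push_back(testline);
    }
    return words;
}

// Performs the read IN the loop condition, so the body only runs
// when a word was actually extracted.
std::vector<std::string> read_in_condition(const std::string& text) {
    std::istringstream Test(text);
    std::vector<std::string> words;
    std::string testline;
    while (Test >> testline) {
        words.push_back(testline);
    }
    return words;
}
```

For the input "CA 92136\n", read_checking_first should return CA, 92136, 92136, while read_in_condition should return just CA, 92136.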
CplusplusAcolyte wrote: "EDIT: Problem solved. I changed the condition of the while loop on line 26 to !Test.eof() and this prevented the final entry being stored twice."
No, it didn't, you just got lucky. Never loop on eof.
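A small sketch of why the eof loop only works by luck. The helper name read_until_eof is hypothetical and the stream is built from a string so the effect of a trailing newline is easy to see.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Loops on !eof(), pushing whatever is in testline after each read,
// whether or not the read succeeded.
std::vector<std::string> read_until_eof(const std::string& text) {
    std::istringstream Test(text);   // stands in for the ifstream
    std::vector<std::string> words;
    std::string testline;
    while (!Test.eof()) {
        Test >> testline;            // may fail without the loop noticing
        words.push_back(testline);
    }
    return words;
}
```

With "CA 92136" (no trailing newline) the last read hits the end of the input while extracting 92136, eof is set, and the result looks right. With "CA 92136\n" the newline is still unread after 92136, so the loop runs once more, the read fails, and 92136 is stored twice. An empty input should even produce one empty string. A stream that failed to open never reports eof at all, so this loop would not terminate on it.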
Thanks LB, that explained a lot. So my earlier attempt in using the loop the way you have shown didn't work because I included my redundant older attempt to read words into the string.
[code listing not preserved]
is much simpler. While I hadn't heard that loop conditions based on eof were bad practice, I like this solution better anyway, and it greatly simplified my code.
[code listing not preserved]
Would you similarly say that looping based on vector size was bad practice, or is this determination limited to eof?
Can't thank you guys enough! I'm learning a lot here.
I don't understand vectors, I'm still a noob and it wasn't covered in the class I took. After some googling, my understanding is a vector is essentially a dynamic array... Is that right? And what does push_back() do?
Can I use strcmp with a vector that's a string type?
Thanks again for all the help I really appreciate it!
You are right, a vector is similar to an array, except the size does not need to be known. push_back(testline) adds testline to the last position in the vector, and is similar to array[i]=testline;, although with an array, you need to know what position you are adding it to and make use of a loop to control the value of i.
The code below does the same thing, but stores the values in an array so you can see the difference. You will see that we had to declare how many values the array holds, and had to base our output loop on the size of the array.
An array cannot have values added to it beyond the number specified at declaration, whereas a vector can have additional entries added through the use of push_back, and doesn't need to know how many values it will hold. Lastly, a vector allows us to output without knowing the number of values by using vectorname.size() in the output loop.
[code listing not preserved]
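The array-versus-vector difference can be sketched like this. It is not the listing from the post: fill_array and fill_vector are hypothetical names, and the words come from a string via std::istringstream instead of a file.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Array version: the caller declares the capacity up front, and the
// counter i tracks which position the next word goes into.
// Returns the number of words stored; extra words are left unread.
std::size_t fill_array(const std::string& text, std::string word[], std::size_t capacity) {
    std::istringstream in(text);
    std::string testline;
    std::size_t i = 0;
    while (i < capacity && in >> testline) {
        word[i] = testline;
        ++i;
    }
    return i;
}

// Vector version: no capacity and no counter; push_back grows the
// vector as needed and size() reports how many words were stored.
std::vector<std::string> fill_vector(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string testline;
    while (in >> testline) {
        words.push_back(testline);
    }
    return words;
}
```

Given "Steven Seagal 1234 Post" and an array declared as std::string word[3], fill_array should store three words and stop, whereas fill_vector should return all four.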
CplusplusAcolyte wrote: "Would you similarly say that looping based on vector size was bad practice, or is this determination limited to eof?"
Yes, I would similarly say that looping on vector size is bad practice. Use iterators instead, or the range-based for-loop when possible.
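For anyone following along, both alternatives might look like the sketch below. The functions are hypothetical and count how often a keyword appears in the vector of words, which ties back to the original word search goal; the range-based for-loop needs C++11 or later.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Range-based for-loop: visits every element without an index.
std::size_t count_matches(const std::vector<std::string>& words,
                          const std::string& keyword) {
    std::size_t count = 0;
    for (const std::string& w : words) {
        if (w == keyword) {          // std::string compares with ==
            ++count;
        }
    }
    return count;
}

// Iterator version: the same traversal written out explicitly.
std::size_t count_matches_with_iterators(const std::vector<std::string>& words,
                                         const std::string& keyword) {
    std::size_t count = 0;
    for (std::vector<std::string>::const_iterator it = words.begin();
         it != words.end(); ++it) {
        if (*it == keyword) {
            ++count;
        }
    }
    return count;
}
```

Since std::string supports ==, strcmp is not needed for the comparison; it would only come into play with C-style strings, for example via c_str().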
I suppose I'm still learning the use of iterators, as vectors are fairly new to me and I found this usage simpler. I'll explore this issue further in another thread if I have trouble grasping it. Thanks for all the feedback, LB. It helped me a lot, and likely benefited Darth as well.
Source: https://www.cplusplus.com/forum/beginner/121657/