NSMutableString -initWithFormat appends to existing text

NSMutableString -initWithFormat appends to existing text

Michele Bert
Hello.
May I ask a question about how to use the API on this list?
I have decided to use GNUstep for work wherever I get the chance, to
build up my Objective-C/GNUstep experience.

In a method I am using and re-using an NSMutableString object: I create
it once with +stringWithCapacity: and then initialize it with
-initWithFormat:. Something like:

  NSMutableString *outmsg=[NSMutableString stringWithCapacity: 1000];
  outmsg=[outmsg initWithFormat: @"first: %@", [arr objectAtIndex: 0]];
  // use outmsg
  outmsg=[outmsg initWithFormat: @"second: %@", [arr objectAtIndex: 1]];
  // use outmsg
  outmsg=[outmsg initWithFormat: @"third: %@", [arr objectAtIndex: 2]];
  // use outmsg

What I observe is that every time I invoke -initWithFormat: the text is
appended to the end of the existing string. Is this the expected
behavior? Am I doing something wrong?
--
Mick

_______________________________________________
Discuss-gnustep mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/discuss-gnustep

Re: NSMutableString -initWithFormat appends to existing text

H. Nikolaus Schaller

> On 18.04.2018 at 09:42, Mick Bert <[hidden email]> wrote:
>
> Hello.
> Can I make question about how to use API, in this list?
> I decided to use GNUstep for all the working activity where I have the
> chance, just to practice my obj-c/gnustep experience.
>
> Well, in a method I am using and re-using an object of
> NSMutableString, so I create it once with +stringWithCapacity, an than
> I initialize with -initWithFormat. Something like:
>
>  NSMutableString *outmsg=[NSMutableString stringWithCapacity: 1000];
>  outmsg=[outmsg initWithFormat: @"first: %@", [arr objectAtIndex: 0]];
>  // use outmsg
>  outmsg=[outmsg initWithFormat: @"second: %@", [arr objectAtIndex: 1]];
>  // use outmsg
> outmsg=[outmsg initWithFormat: @"third: %@", [arr objectAtIndex: 2]];
>  // use outmsg
>
> What I observe is that every time I invoke -initWithFormat the text is
> appended at the end of the existing string. Is it the right behavior
> to expect? Am I doing something wrong?

-init or -initWithFormat: should be called only once, and only directly after +alloc.
You use +stringWithCapacity:, which returns an object that has already been initialized.

Use -setString: @"" followed by -appendFormat: instead.
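
For illustration, a minimal sketch of that pattern. The helper name FormatMessages and the label array are made up for this example, and note that the NSMutableString method is spelled -appendFormat:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds one message per element of `arr`, reusing a
// single NSMutableString instead of re-initialising it.
static NSArray *FormatMessages(NSArray *arr)
{
  NSArray *labels = [NSArray arrayWithObjects: @"first", @"second", @"third", nil];
  NSMutableArray *result = [NSMutableArray array];
  NSMutableString *outmsg = [NSMutableString stringWithCapacity: 1000];
  NSUInteger i;

  for (i = 0; i < [labels count] && i < [arr count]; i++)
    {
      [outmsg setString: @""];   // empty the string; the object itself is reused
      [outmsg appendFormat: @"%@: %@",
        [labels objectAtIndex: i], [arr objectAtIndex: i]];
      // "use outmsg": here we just keep an immutable snapshot of it
      [result addObject: [NSString stringWithString: outmsg]];
    }
  return result;
}
```

Each pass starts from an empty string, so nothing from the previous message is carried over.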

-- hns

Re: NSMutableString -initWithFormat appends to existing text

David Chisnall-7
In reply to this post by Michele Bert
On 18 Apr 2018, at 08:42, Mick Bert <[hidden email]> wrote:

>
> Hello.
> Can I make question about how to use API, in this list?
> I decided to use GNUstep for all the working activity where I have the
> chance, just to practice my obj-c/gnustep experience.
>
> Well, in a method I am using and re-using an object of
> NSMutableString, so I create it once with +stringWithCapacity, an than
> I initialize with -initWithFormat. Something like:
>
>  NSMutableString *outmsg=[NSMutableString stringWithCapacity: 1000];
>  outmsg=[outmsg initWithFormat: @"first: %@", [arr objectAtIndex: 0]];
>  // use outmsg
>  outmsg=[outmsg initWithFormat: @"second: %@", [arr objectAtIndex: 1]];
>  // use outmsg
> outmsg=[outmsg initWithFormat: @"third: %@", [arr objectAtIndex: 2]];
>  // use outmsg
>
> What I observe is that every time I invoke -initWithFormat the text is
> appended at the end of the existing string. Is it the right behavior
> to expect? Am I doing something wrong?

That’s not what I would expect to happen, but in general Objective-C considers calling an init-family method on the same object to be undefined behaviour unless the class explicitly advertises support for this.  Most init-family methods assume that all ivars are zero on entry and so will either leak memory or do very odd things if you call them on an already-initialised object.

It’s somewhat unfortunate that Objective-C has +alloc and -dealloc, but doesn’t have a -deinit method to correspond to -init and reset an object into a state where it is safe to reinitialise it, though with the performance of modern memory allocators you don’t actually save very much by skipping reallocation in most cases.  The few cases where it does make sense are usually done by a -reset or similar method that efficiently restores an object to a pristine state.

Note that, in your example, the +stringWithCapacity: method is expected to call an init-family method and so any call to any other initialiser is likely to be undefined.
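
A small sketch of the alternative that keeps to "one initialiser per object": create a fresh string for every message instead of re-initialising an old one. The helper name MessageForItem is invented for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: every call returns a newly created string, so an
// init-family method runs exactly once per object.
static NSMutableString *MessageForItem(NSString *label, id item)
{
  // +stringWithFormat: allocates and initialises in a single step and
  // returns an autoreleased object; no further -init... call is needed.
  return [NSMutableString stringWithFormat: @"%@: %@", label, item];
}
```

Calling MessageForItem(@"first", [arr objectAtIndex: 0]) and then MessageForItem(@"second", [arr objectAtIndex: 1]) gives two independent strings, and the second contains nothing left over from the first.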

David


Re: NSMutableString -initWithFormat appends to existing text

Michele Bert
In reply to this post by H. Nikolaus Schaller
2018-04-18 9:52 GMT+02:00 H. Nikolaus Schaller <[hidden email]>:
>
> -init or -initWithFormat: should be called only once and only after +alloc.
> You use stringWithCapacity which has already been initialized.
>
> Use -setString:@"" followed by -appendFormat: instead.
>
Thanks a lot!
Now it works as expected.

One more question. I am writing to a text file by formatting NSString
objects and passing them to NSFileHandle -writeData:.
Is this the preferred way? Are there any other classes for working with
text-oriented files?

--
Mick

Re: NSMutableString -initWithFormat appends to existing text

Ivan Vučica-2


On 18 Apr 2018, at 09:23, Mick Bert <[hidden email]> wrote:

> 2018-04-18 9:52 GMT+02:00 H. Nikolaus Schaller <[hidden email]>:
>
>> -init or -initWithFormat: should be called only once and only after +alloc.
>> You use stringWithCapacity which has already been initialized.
>>
>> Use -setString:@"" followed by -appendFormat: instead.
>
> Thanks a lot!
> Now it works as expected.

It may, but invoking init multiple times is still a bad idea.


> One more question. I am writing to a text file by formatting NSString
> objects and passing them to NSFileHandle -writeData:.
> Is this the preferred way? Are there any other classes for working with
> text-oriented files?

If you don’t need anything more, NSString includes -writeToFile:atomically:encoding:error:.

What does a ‘text oriented file’ mean? You still have a format, whether that’s Windows-style INI, or JSON, or the ASCII and XML property list formats that are first-class citizens under GNUstep and similar systems.

For portability of data to other languages, I’d say writing JSON is simplest. NSJSONSerialization is what you want.
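
A rough sketch of both options, assuming UTF-8 output. The helper names WriteText and WriteJSON are made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: writes a whole string to `path` as UTF-8 text.
static BOOL WriteText(NSString *text, NSString *path)
{
  NSError *error = nil;
  BOOL ok = [text writeToFile: path
                   atomically: YES
                     encoding: NSUTF8StringEncoding
                        error: &error];

  if (!ok)
    {
      NSLog(@"Could not write %@: %@", path, error);
    }
  return ok;
}

// Hypothetical helper: serialises an array or dictionary to JSON at `path`.
static BOOL WriteJSON(id object, NSString *path)
{
  NSError *error = nil;
  NSData *json;

  // Only arrays/dictionaries of strings, numbers, NSNull and nested
  // arrays/dictionaries can be serialised.
  if (![NSJSONSerialization isValidJSONObject: object])
    {
      return NO;
    }
  json = [NSJSONSerialization dataWithJSONObject: object options: 0 error: &error];
  if (json == nil)
    {
      NSLog(@"Could not serialise: %@", error);
      return NO;
    }
  return [json writeToFile: path atomically: YES];
}
```

Both helpers write the complete content in one go, so they fit data that comfortably fits in memory.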



Re: NSMutableString -initWithFormat appends to existing text

Michele Bert
2018-04-18 11:29 GMT+02:00 Ivan Vučica <[hidden email]>:

>
>
> On 18 Apr 2018, at 09:23, Mick Bert <[hidden email]> wrote:
>
> 2018-04-18 9:52 GMT+02:00 H. Nikolaus Schaller <[hidden email]>:
>
>
> -init or -initWithFormat: should be called only once and only after +alloc.
> You use stringWithCapacity which has already been initialized.
>
> Use -setString:@"" followed by -appendFormat: instead.
>
> Thanks a lot!
> Now it works as expected.
>
>
> It may, but invoking init multiple times is still a bad idea.
>
I mean: now that I have applied the above suggestion, it works ^__^

>
> One question more. I am writing in a text file, by formatting NSString
> objects to give to NSFileHandle -writeData.
> Is it the preferable way? Are there any other class to work with
> text-oriended files?
>
>
> If you don’t need anything more, NSString includes writeToFile:.
>
> What does a ‘text oriented file’ mean?

By text-oriented I mean generic text streams, organized as a sequence of
variable-length lines separated by newline characters. Very generic readable
text.

I cannot load the entire file into memory because it is too big. So I
process it line by line, writing the output to another file.

--
Mick

Re: NSMutableString -initWithFormat appends to existing text

Riccardo Mottola-5
Hi Mick,


Mick Bert wrote:
> I mean: now, that I applied the above suggestion, it works ^__^

The mantra is: allocate and init once.
For immutable classes (e.g. NSArray, NSString) that is also the only thing
you can do; for mutable classes (e.g. NSMutableString, NSMutableArray) you
then use the mutation methods.


> With test-oriented i mean generic text streams, organized as a
> sequence of
> variable length lines, separated by new-line character. Very generic readable
> text.
>
> I cannot load the entire file in memory, because it is too big. Thus I
> process it
> line by line, writing the output in an other file.

You can write the stream directly; that may work for you even for quite
large data, and it is efficient.

Otherwise, indeed, work with NSFileHandle:

https://www.gnu.org/software/gnustep/resources/documentation/Developer/Base/Reference/NSFileHandle.html


Riccardo

Re: NSMutableString -initWithFormat appends to existing text

Michele Bert
2018-04-18 15:00 GMT+02:00 Riccardo Mottola <[hidden email]>:
> Hi Mick,
>
>
> You can write the stream directly, it may work for you, also for quite big
> data, it is efficient.
>
What do you mean by "write the stream directly"?

>
> Otherwise indeed, work with NSFileHandle
>
> https://www.gnu.org/software/gnustep/resources/documentation/Developer/Base/Reference/NSFileHandle.html
>

How about reading line by line?
Do I have to read a chunk of the file and look for the line terminator
myself? Isn't there something like C++'s std::getline()?

--
Mick

Re: NSMutableString -initWithFormat appends to existing text

David Chisnall-7
In reply to this post by Michele Bert
On 18 Apr 2018, at 10:23, Mick Bert <[hidden email]> wrote:
>
> One question more. I am writing in a text file, by formatting NSString
> objects to give to NSFileHandle -writeData.

That’s a sensible thing to do.  The reason for going via NSData is that NSString itself is representation agnostic.  When you generate an NSData, you are explicitly telling the string how to encode the characters and transforming it into a concrete representation (which may or may not be the same encoding as the in-memory representation used by the NSString class).
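
As a small illustration of that step, here is a sketch that picks UTF-8 as the concrete representation. The helper name AppendLine is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: encodes one line as UTF-8 and writes it, followed
// by a newline, to an NSFileHandle that is open for writing.
static BOOL AppendLine(NSFileHandle *fh, NSString *line)
{
  // The string only becomes a concrete sequence of bytes here, in the
  // encoding we ask for.
  NSData *data = [[line stringByAppendingString: @"\n"]
    dataUsingEncoding: NSUTF8StringEncoding];

  if (data == nil)
    {
      return NO;   // the string could not be represented in this encoding
    }
  [fh writeData: data];   // raises an exception if the write itself fails
  return YES;
}
```

The handle could come from +fileHandleForWritingAtPath:, which returns nil unless the file already exists, so the file has to be created first (for instance with NSFileManager).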

> Is it the preferable way?

That depends a bit on what you mean by ‘preferable’.  If you mean ‘simpler’ or ‘cleaner code’, then I don’t think so.  If you mean ‘faster’ or ‘lower memory’, then you may do better by using NSString’s -getBytes:… method (I forget exactly which one GNUstep provides - I need to add the version Apple now provides) to copy into an on-stack buffer and then writing that out with the lower-level C / C++ APIs.  In most cases, the overhead of the I/O is going to be sufficient that this won’t make a noticeable difference, but if you’re processing a lot of data and have NVMe storage then you might consider this.
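
A sketch of the on-stack-buffer idea. The exact -getBytes:… variant is left open above, so this example substitutes -getCString:maxLength:encoding:, which is a different method with the same effect of copying the encoded bytes into a caller-supplied buffer. The helper name WriteLineFast and the 4096-byte buffer size are assumptions of the example.

```objc
#import <Foundation/Foundation.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: writes `line` plus a newline to a C stream without
// creating an intermediate NSData object for ordinary lines.
static BOOL WriteLineFast(FILE *out, NSString *line)
{
  char buffer[4096];   // on-stack buffer; the size is an arbitrary choice

  if ([line getCString: buffer
             maxLength: sizeof(buffer)
              encoding: NSUTF8StringEncoding])
    {
      // The buffer is NUL-terminated, so a line containing an embedded
      // NUL character would be cut short here.
      size_t len = strlen(buffer);

      return fwrite(buffer, 1, len, out) == len && fputc('\n', out) != EOF;
    }
  else
    {
      // The line did not fit into the buffer: fall back to NSData.
      NSData *data = [line dataUsingEncoding: NSUTF8StringEncoding];

      if (data == nil)
        {
          return NO;
        }
      return fwrite([data bytes], 1, [data length], out) == [data length]
        && fputc('\n', out) != EOF;
    }
}
```

Whether this is measurably faster than going through NSData depends on the storage; it is only worth trying if profiling shows the CPU side matters.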

> Are there any other class to work with
> text-oriended files?

Note that text-oriented files don’t really exist as an abstraction on most *NIX systems (though the C standard still likes to pretend that they do).  GNUstep / Cocoa don’t provide useful abstractions for this, though the C++ standard streams library does (not very good ones though, so I don’t really suggest using it).

David


Re: NSMutableString -initWithFormat appends to existing text

Riccardo Mottola-5
In reply to this post by Michele Bert
Hi,

Mick Bert wrote:
>
> What do you mean by "write the stream directly"?

You dump the whole string directly to a path. Easy and fast!

>
>> Otherwise indeed, work with NSFileHandle
>>
>> https://www.gnu.org/software/gnustep/resources/documentation/Developer/Base/Reference/NSFileHandle.html
>>
> How about reading per-line?
> Have I to read a chunk of file, and look by myself for the line
> terminator? Aren't there something like the std::getline() of c++?

As far as I know, no. You can read per character, per byte, or by sized
chunks, but not per line. You could write an extension yourself if needed.
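
Such an extension could look roughly like the sketch below. It is written as a plain function rather than a category, and assumes UTF-8 input with "\n" line endings. ReadLine, the AUTORELEASED macro and the 64 KB chunk size are all made up for this example.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Lets the sketch compile both with and without ARC.
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#if __has_feature(objc_arc)
#define AUTORELEASED(obj) (obj)
#else
#define AUTORELEASED(obj) [(obj) autorelease]
#endif

// Hypothetical helper: returns the next line from `fh` without its
// newline, or nil at end of file. `pending` holds bytes that were read
// but not yet returned; pass the same object on every call.
// Note: a line that is not valid UTF-8 also yields nil.
static NSString *ReadLine(NSFileHandle *fh, NSMutableData *pending)
{
  for (;;)
    {
      const char *bytes = (const char *)[pending bytes];
      NSUInteger length = [pending length];
      const char *nl = (length > 0)
        ? (const char *)memchr(bytes, '\n', length) : NULL;
      NSString *line;
      NSData *chunk;

      if (nl != NULL)
        {
          NSUInteger lineLength = (NSUInteger)(nl - bytes);
          NSUInteger rest = lineLength + 1;   // skip the newline as well

          line = AUTORELEASED([[NSString alloc] initWithBytes: bytes
                                                       length: lineLength
                                                     encoding: NSUTF8StringEncoding]);
          // Keep only the bytes after the newline (simple, not optimal).
          [pending setData:
            [pending subdataWithRange: NSMakeRange(rest, length - rest)]];
          return line;
        }

      chunk = [fh readDataOfLength: 65536];
      if ([chunk length] == 0)
        {
          // End of file: hand back an unterminated last line, if any.
          if (length == 0)
            {
              return nil;
            }
          line = AUTORELEASED([[NSString alloc] initWithData: pending
                                                    encoding: NSUTF8StringEncoding]);
          [pending setLength: 0];
          return line;
        }
      [pending appendData: chunk];
    }
}
```

A caller would open the file with +fileHandleForReadingAtPath:, create one empty NSMutableData, and loop until ReadLine returns nil. Only one chunk plus the current partial line is held in memory at a time.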

However, a very quick way is to slurp the whole file into a string and
split it into an NSArray with -componentsSeparatedByString:.
I have used that with files of GBytes in size, provided your system has
enough memory.
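
In code, the slurp-and-split approach is only a few lines. This sketch assumes a UTF-8 file with "\n" line endings, and the helper name LinesOfFile is made up.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reads the whole file into memory and splits it
// into lines. Returns nil if the file cannot be read or decoded.
static NSArray *LinesOfFile(NSString *path)
{
  NSError *error = nil;
  NSString *contents = [NSString stringWithContentsOfFile: path
                                                 encoding: NSUTF8StringEncoding
                                                    error: &error];

  if (contents == nil)
    {
      NSLog(@"Could not read %@: %@", path, error);
      return nil;
    }
  // A file that ends with a newline yields an empty string as the last
  // element of the array.
  return [contents componentsSeparatedByString: @"\n"];
}
```

The whole file and the array of lines are in memory at the same time, so this only suits files that fit comfortably in RAM.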

My suggestion is to try that first, start writing your program, and
optimize later, unless you already know that you need to work on files
too big to handle that way.

Riccardo

Re: NSMutableString -initWithFormat appends to existing text

David Chisnall-7
On 18 Apr 2018, at 16:28, Riccardo Mottola <[hidden email]> wrote:
>
> As far as I know, no. you can read per character, per bytes, by sized chunks. But not per dat. You could write an extension yourself if needed.
>
> However, a very quick way is to slurp the whole file in a string and split in an NSArray with -componentsSeparatedByString:
> I have used that with files of GBytes, provided your system has memory.

It’s also worth noting that, on a 64-bit system, it can sometimes be very efficient to mmap the entire file and then lazily construct NSString instances that use the provided buffer.  The file won’t all be read if you don’t have enough memory, and when memory becomes scarce the underlying string storage will be evicted and re-read on demand, though the NSString instances will remain around.
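
A sketch of that idea using Foundation instead of calling mmap directly. MapFile, StringForRange and the AUTORELEASED macro are made up for this example. +dataWithContentsOfMappedFile: is the older spelling; Apple's Foundation now prefers +dataWithContentsOfFile:options:error: with NSDataReadingMappedIfSafe. Whether the string really shares the mapped bytes or copies them is up to the implementation.

```objc
#import <Foundation/Foundation.h>

// Lets the sketch compile both with and without ARC.
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#if __has_feature(objc_arc)
#define AUTORELEASED(obj) (obj)
#else
#define AUTORELEASED(obj) [(obj) autorelease]
#endif

// Hypothetical helper: maps the file into memory where the platform
// supports it, instead of reading it up front.
static NSData *MapFile(NSString *path)
{
  return [NSData dataWithContentsOfMappedFile: path];
}

// Hypothetical helper: builds a string for one byte range of the mapped
// file, asking NSString not to copy the bytes. The string does not keep
// `mapped` alive, so `mapped` must outlive every string created from it.
static NSString *StringForRange(NSData *mapped, NSRange range)
{
  const char *base = (const char *)[mapped bytes];

  if (range.location > [mapped length]
    || range.length > [mapped length] - range.location)
    {
      return nil;   // range lies outside the file
    }
  return AUTORELEASED([[NSString alloc]
    initWithBytesNoCopy: (void *)(base + range.location)
                 length: range.length
               encoding: NSUTF8StringEncoding
           freeWhenDone: NO]);
}
```

The byte ranges of the individual lines still have to be found, for example by scanning the mapped bytes for newline characters with memchr.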

David


Re: NSMutableString -initWithFormat appends to existing text

Michele Bert
In reply to this post by David Chisnall-7
2018-04-18 17:15 GMT+02:00 David Chisnall <[hidden email]>:
> On 18 Apr 2018, at 10:23, Mick Bert <[hidden email]> wrote:
>>
>> Is it the preferable way?
>

> That depends a bit on what you mean by ‘preferable’.  If you mean
> ‘simpler’ or ‘cleaner code’, then I don’t think so.  If you mean
> ‘faster’ or ‘lower memory’, then you may find that using NSString’s
> -getBytes:… method into an on-stack buffer then write that using the
> lower-level C / C++ APIs.  In most cases, the overhead of the I/O is
> going to be sufficient that this won’t make a noticeable difference,
> but if you’re processing a lot of data and have NVMe storage then
> you might consider this.


Sometimes I have to process files several dozen GBytes in size, with 200
billion lines (it took a couple of minutes just to count them :-D ).
I have successfully written Perl scripts to process them, and it was
interesting. Now I would like to do it in a GNUstep tool, just to
practice with the base classes and the language itself.

>> Are there any other class to work with
>> text-oriended files?
>
> Note that text-oriented files don’t really exist as an abstraction
> on most *NIX systems (though the C standard still likes to pretend
> that they do).  GNUstep / Cocoa don’t provide useful abstractions for
> this, though the C++ standard streams library does (not very good
> ones though, so I don’t really suggest using it).  David

Here I don't follow you any more. Whenever I have to write information
to a file, I always prefer a readable form, so that I can access it
with a text editor, without needing any particular tool (or any
particular version of one). At least as long as performance is not a
concern (e.g. random seeking is needed, or parsing the syntax is
computationally too heavy).

--
Mick

Re: NSMutableString -initWithFormat appends to existing text

David Chisnall-7
On 19 Apr 2018, at 07:56, Mick Bert <[hidden email]> wrote:

>
> 2018-04-18 17:15 GMT+02:00 David Chisnall <[hidden email]>:
>> On 18 Apr 2018, at 10:23, Mick Bert <[hidden email]> wrote:
>>>
>>> Is it the preferable way?
>>
>
>> That depends a bit on what you mean by ‘preferable’.  If you mean
>> ‘simpler’ or ‘cleaner code’, then I don’t think so.  If you mean
>> ‘faster’ or ‘lower memory’, then you may find that using NSString’s
>> -getBytes:… method into an on-stack buffer then write that using the
>> lower-level C / C++ APIs.  In most cases, the overhead of the I/O is
>> going to be sufficient that this won’t make a noticeable difference,
>> but if you’re processing a lot of data and have NVMe storage then
>> you might consider this.
>
>
> Sometimes I have to process files several dozen of GByte large, 200
> bilions of lines (it took a couple of minutes just to cont them :-D ).

That implies that your storage is quite slow.  On modern NVMe storage, I’d expect to be able to process a few TBs in that time.  As such, it’s not worth optimising the CPU side too much, because you’re mostly waiting for I/O.  

> I have successfully written perl scripts to process them, and it was
> interesting. Now I would like to do it in a gnustep tool, just to
> practice with base classes, and the language itself.

It’s probably good practice, but don’t expect to see much of a speedup, if any (though you might see the CPU load and temperature go down a bit).

>>> Are there any other class to work with
>>> text-oriended files?
>>
>> Note that text-oriented files don’t really exist as an abstraction
>> on most *NIX systems (though the C standard still likes to pretend
>> that they do).  GNUstep / Cocoa don’t provide useful abstractions for
>> this, though the C++ standard streams library does (not very good
>> ones though, so I don’t really suggest using it).  David
>
> Here I don't follow you any more. Whenever I have to write information
> in a file, I always prefer readable form, so that I can access them
> with a text editor, without the need of any particular tool (of any
> particular version). At least as long as performance are concernd
> (i.e. randomly seeking is needed, or syntax interpretation
> is computationally too  heavy)

Some file systems have a concept of a text file as a distinct thing from a binary file.  The low-level APIs handle things like character set conversions, breaking into records, and so on.  The C and Windows APIs and, to a lesser extent, even POSIX have some vestiges of this, but most modern systems don’t differentiate files at that layer: they’re just files and it’s up to the reader to understand them.

Off topic now, but note that one of the down sides of not having a format that supports random access is that it is very difficult to process in parallel.  Given your system, I’d also recommend looking at compressing the data on disk.  For the traces that our processor generates, I moved from a human-readable text representation to a structured binary format that can be stored xz-compressed.  Even on machines with relatively fast (non-NVMe) flash storage, the code that reads the binary format and generates human-readable text can stream a moderately large trace (around 100GB in text format) to /dev/null faster than cat can do the same with the text file.  In both cases, disk I/O is the limiting factor.  The xz decompression can run in a separate thread to the processing (so can run on ahead filling up a buffer to process as fast as the disk can give you data, until the consumer catches up) and on a moderately fast CPU can decompress faster than the SSD can give you data.

David

