GUILE 2/3 and string encoding cost

GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
I looked a bit through the GUILE source code to see what is going on.

I believe our current hypothesis (LilyPond's slowdown is caused by
expensive Unicode transcoding into 32-bit strings) is incorrect.

If you look into the source code, you can see that the UTF-8 -> SCM
conversion checks if there are any code points over 255


https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620

if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
string as a normal byte array. This code walks the string twice, but that
is very cheap due to CPU cache locality, so it should be
essentially equivalent to whatever GUILE 1.8 was doing.

The conversion in the other direction is here:
https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n2065

as you can see, if the string is narrow (Latin1/ASCII), it uses the cheap
path as well.
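
The narrow/wide decision described above can be modelled in a few lines. This is only an illustrative sketch in Python, not Guile's C code: the function names are hypothetical, and the use of UTF-32 for the wide case is an assumption based on the "32-bit strings" wording in this thread.

```python
def to_guile_like_storage(utf8_bytes: bytes):
    """Model of the narrow/wide choice (hypothetical helper, not Guile's API)."""
    # Pass 1: decode the UTF-8 input and inspect every code point.
    text = utf8_bytes.decode("utf-8")
    narrow = all(ord(ch) <= 255 for ch in text)
    if narrow:
        # Pass 2, cheap case: one byte per character (Latin-1).
        return "narrow", text.encode("latin-1")
    # Pass 2, expensive case: four bytes per character.
    return "wide", text.encode("utf-32-le")


def to_utf8(kind: str, payload: bytes) -> bytes:
    """Reverse direction: stored string back to UTF-8."""
    if kind == "narrow":
        return payload.decode("latin-1").encode("utf-8")
    return payload.decode("utf-32-le").encode("utf-8")


glyph = to_guile_like_storage(b"noteheads.s2")            # pure ASCII
name = to_guile_like_storage("Schütz".encode("utf-8"))    # Latin-1 range
clef = to_guile_like_storage("𝄞".encode("utf-8"))         # above 255
```

In this model an ASCII glyph name and a Latin-1 title both stay at one byte per character; only the last example takes the wide path.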

LilyPond internally doesn't use any Unicode strings, as all our identifiers
are pure ASCII, as well as internal strings (e.g. font glyph names). This
means that files that do not use Unicode characters at all should have the
same overhead for strings as GUILE 1.8.

Even so, if the input file does use UTF-8, there should be little overhead,
because the number of texts that we process is always small. LilyPond is
not a text processor.

So, what hard data do we have on GUILE 2/3 slowness, and what does that
data say?

--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen

Re: GUILE 2/3 and string encoding cost

dak
Han-Wen Nienhuys <[hidden email]> writes:

> I looked a bit through the GUILE source code to see what is going on.
>
> I believe our current hypothesis (LilyPond's slowdown is caused by
> expensive unicode transcoding into 32-bit strings) is incorrect.
>
> If you look into the source code, you can see that the UTF-8 -> SCM
> conversion checks if there are any code points over 255
>
>
> https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
>
> if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
> string as a normal byte array. This code walks the string twice, but that
> is very cheap due to CPU cache locality, so it should be
> essentially equivalent to whatever GUILE 1.8 was doing.

GUILE 1.8 did not walk the string even once.

> LilyPond internally doesn't use any Unicode strings, as all our
> identifiers are pure ascii, as well as internal strings (eg. font
> glyph names). This means that files that do not use Unicode characters
> at all should have the same overhead for strings as GUILE 1.8.

We already use the latin1 calls for LilyPond internals.

> Even so, if the input flie does use UTF-8, there should be little
> overhead, because the number of texts that we process is always
> small. LilyPond is not a text processor.
>
> So, what hard data do we have on GUILE 2/3 slowness, and what does
> that data say?

That data says "humongous slowdown".  There is not much more than
speculation what this is caused by as far as I know.

--
David Kastrup


Re: GUILE 2/3 and string encoding cost

Thomas Morley-2
On Wed, 22 Jan 2020 at 12:02, David Kastrup <[hidden email]> wrote:
>
> Han-Wen Nienhuys <[hidden email]> writes:

> > So, what hard data do we have on GUILE 2/3 slowness, and what does
> > that data say?
>
> That data says "humongous slowdown".  There is not much more than
> speculation what this is caused by as far as I know.

I can't provide any insight here.
Though, once I have a working LilyPond/guile-3, I'll test how
"humongous" it will be.

A working LilyPond/guile-3 means: successful make, make doc, make
test-baseline.
Currently only the first is done.

I have to run for my regular job now, probably more in the (late) evening.


Cheers,
  Harm


Re: GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
In reply to this post by dak
On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]> wrote:

> Han-Wen Nienhuys <[hidden email]> writes:
>
> > I looked a bit through the GUILE source code to see what is going on.
> >
> > I believe our current hypothesis (LilyPond's slowdown is caused by
> > expensive unicode transcoding into 32-bit strings) is incorrect.
> >
> > If you look into the source code, you can see that the UTF-8 -> SCM
> > conversion checks if there are any code points over 255
> >
> >
> >
> https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
> >
> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
> > string as a normal byte array. This code walks the string twice, but that
> > is very cheap due to CPU cache locality, so it should be
> > essentially equivalent to whatever GUILE 1.8 was doing.
>
> GUILE 1.8 did not walk the string even once
>

GUILE 1.8 walks it once when you do memcpy.


> > Even so, if the input flie does use UTF-8, there should be little
> > overhead, because the number of texts that we process is always
> > small. LilyPond is not a text processor.
> >
> > So, what hard data do we have on GUILE 2/3 slowness, and what does
> > that data say?
>
> That data says "humongous slowdown".  There is not much more than
> speculation what this is caused by as far as I know.
>
>
Do we have a standardized test file for benchmarking performance?

--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen

Re: GUILE 2/3 and string encoding cost

dak
Han-Wen Nienhuys <[hidden email]> writes:

> On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]> wrote:
>
>> Han-Wen Nienhuys <[hidden email]> writes:
>>
>> > I looked a bit through the GUILE source code to see what is going on.
>> >
>> > I believe our current hypothesis (LilyPond's slowdown is caused by
>> > expensive unicode transcoding into 32-bit strings) is incorrect.
>> >
>> > If you look into the source code, you can see that the UTF-8 -> SCM
>> > conversion checks if there are any code points over 255
>> >
>> >
>> >
>> https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
>> >
>> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
>> > string as a normal byte array. This code walks the string twice, but that
>> > is very cheap due to CPU cache locality, so it should be
>> > essentially equivalent to whatever GUILE 1.8 was doing.
>>
>> GUILE 1.8 did not walk the string even once
>>
>
> GUILE 1.8 walks it once when you do memcpy.

Ok, but that's sort of a cheap walk.

>> > Even so, if the input flie does use UTF-8, there should be little
>> > overhead, because the number of texts that we process is always
>> > small. LilyPond is not a text processor.
>> >
>> > So, what hard data do we have on GUILE 2/3 slowness, and what does
>> > that data say?
>>
>> That data says "humongous slowdown".  There is not much more than
>> speculation what this is caused by as far as I know.
>>
>>
> Do we have a standardized test file for benchmarking performance?

input/regression/mozart-hrn-3.ly possibly, but it's not particularly
large.

--
David Kastrup


Re: GUILE 2/3 and string encoding cost

Carl Sorensen-3


On 1/22/20, 1:21 PM, "lilypond-devel on behalf of David Kastrup" <lilypond-devel-bounces+c_sorensen=[hidden email] on behalf of [hidden email]> wrote:

    Han-Wen Nienhuys <[hidden email]> writes:
   
    > On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]> wrote:
    >
    >> Han-Wen Nienhuys <[hidden email]> writes:
    >>
    >> > I looked a bit through the GUILE source code to see what is going on.
    >> >
    >> > I believe our current hypothesis (LilyPond's slowdown is caused by
    >> > expensive unicode transcoding into 32-bit strings) is incorrect.
    >> >
    >> > If you look into the source code, you can see that the UTF-8 -> SCM
    >> > conversion checks if there are any code points over 255
    >> >
    >> >
    >> >
    >> https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
    >> >
    >> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
    >> > string as a normal byte array. This code walks the string twice, but that
    >> > is very cheap due to CPU cache locality, so it should be
    >> > essentially equivalent to whatever GUILE 1.8 was doing.
    >>
    >> GUILE 1.8 did not walk the string even once
    >>
    >
    > GUILE 1.8 walks it once when you do memcpy.
   
    Ok, but that's sort of a cheap walk.
   
    >> > Even so, if the input flie does use UTF-8, there should be little
    >> > overhead, because the number of texts that we process is always
    >> > small. LilyPond is not a text processor.
    >> >
    >> > So, what hard data do we have on GUILE 2/3 slowness, and what does
    >> > that data say?
    >>
    >> That data says "humongous slowdown".  There is not much more than
    >> speculation what this is caused by as far as I know.
    >>
    >>
    > Do we have a standardized test file for benchmarking performance?
   
    input/regression/mozart-hrn-3.ly possibly, but it's not particularly
    large.

We don't have a standardized test file, but we do have some representative results from a couple of (unknown but described) files:

https://lists.gnu.org/archive/html/lilypond-devel/2018-10/msg00054.html

Perhaps we could get those files to become standards (along with some other, shorter-compiling files).

Carl
 


Re: GUILE 2/3 and string encoding cost

Urs Liska-3
On Wednesday, 22.01.2020, at 20:28 +0000, Carl Sorensen wrote:

>
> On 1/22/20, 1:21 PM, "lilypond-devel on behalf of David Kastrup" <
> lilypond-devel-bounces+c_sorensen=[hidden email] on behalf of
> [hidden email]> wrote:
>
>     Han-Wen Nienhuys <[hidden email]> writes:
>    
>     > On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]>
> wrote:
>     >
>     >> Han-Wen Nienhuys <[hidden email]> writes:
>     >>
>     >> > I looked a bit through the GUILE source code to see what is
> going on.
>     >> >
>     >> > I believe our current hypothesis (LilyPond's slowdown is
> caused by
>     >> > expensive unicode transcoding into 32-bit strings) is
> incorrect.
>     >> >
>     >> > If you look into the source code, you can see that the UTF-8
> -> SCM
>     >> > conversion checks if there are any code points over 255
>     >> >
>     >> >
>     >> >
>     >>
> https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
>     >> >
>     >> > if there aren't, it uses Latin1 encoding ("narrow == 1") to
> encode the
>     >> > string as a normal byte array. This code walks the string
> twice, but that
>     >> > is very cheap due to CPU cache locality, so it should be
>     >> > essentially equivalent to whatever GUILE 1.8 was doing.
>     >>
>     >> GUILE 1.8 did not walk the string even once
>     >>
>     >
>     > GUILE 1.8 walks it once when you do memcpy.
>    
>     Ok, but that's sort of a cheap walk.
>    
>     >> > Even so, if the input flie does use UTF-8, there should be
> little
>     >> > overhead, because the number of texts that we process is
> always
>     >> > small. LilyPond is not a text processor.
>     >> >
>     >> > So, what hard data do we have on GUILE 2/3 slowness, and
> what does
>     >> > that data say?
>     >>
>     >> That data says "humongous slowdown".  There is not much more
> than
>     >> speculation what this is caused by as far as I know.
>     >>
>     >>
>     > Do we have a standardized test file for benchmarking
> performance?
>    
>     input/regression/mozart-hrn-3.ly possibly, but it's not
> particularly
>     large.
>
> We don't have a standardized test file, but we do have some
> representative results from a couple of (unknown but described)
> files:
>
> https://lists.gnu.org/archive/html/lilypond-devel/2018-10/msg00054.html
>
> Perhaps we could get those files to become standards (along with some
> other, shorter-compiling files).
>

Not right now but in the not-so-distant future I'd be able¹ to provide
the 650 examples from the Mozart violin school as a set of many small
scores, which might be a nice complement to one large score.

Urs

¹ It's not about copyright (the edition is released under a CC) but
about being ready for that purpose.

> Carl
>  
>



Re: GUILE 2/3 and string encoding cost

Karlin High
In reply to this post by Han-Wen Nienhuys-3
On 1/22/2020 2:07 PM, Han-Wen Nienhuys wrote:
> Do we have a standardized test file for benchmarking performance?

I can't speak to "standardized," but I do remember some threads that had
benchmarking going on by various users, using large LilyPond projects.

The Robert Carver "Missa Dum Sacrum Mysterium" from Vaughan McAlley.
(114 pages)

<https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html>

And the Heinrich Schütz "Schwanengesang" Psalm 119 from Brent Annable.
(194 pages)

<https://lists.gnu.org/archive/html/lilypond-user/2018-05/msg00211.html>
--
Karlin High
Missouri, USA


Re: GUILE 2/3 and string encoding cost

Thomas Morley-2
In reply to this post by Thomas Morley-2
On Wed, 22 Jan 2020 at 12:12, Thomas Morley
<[hidden email]> wrote:

>
> On Wed, 22 Jan 2020 at 12:02, David Kastrup <[hidden email]> wrote:
> >
> > Han-Wen Nienhuys <[hidden email]> writes:
>
> > > So, what hard data do we have on GUILE 2/3 slowness, and what does
> > > that data say?
> >
> > That data says "humongous slowdown".  There is not much more than
> > speculation what this is caused by as far as I know.
>
> I can't provide any insight here.
> Though, once I have a working LilyPond/guile-3, I'll test how
> "humongous" it will.
>
> A working LilyPond/guile-3 means: successfull make, make doc, make
> test-baseline.
> Currently only the first is done.

I've now got a successful ´make LANGS='' doc´

´make test-baseline´ still fails, as it did with all recent guile-versions.
Though, I don't understand why it does so. Compiling the docs will
return the regression-tests as html/pdf already.
So all files are already successfully compiled.
What's the difference?

Next I'll apply Han-Wen's patch about ly_scm_write_string
https://sourceforge.net/p/testlilyissues/issues/5666/
(Earlier I suspected ly_scm_write_string to be the problem)


Cheers,
  Harm


Re: GUILE 2/3 and string encoding cost

Thomas Morley-2
In reply to this post by Karlin High
On Wed, 22 Jan 2020 at 21:52, Karlin High <[hidden email]> wrote:

>
> On 1/22/2020 2:07 PM, Han-Wen Nienhuys wrote:
> > Do we have a standardized test file for benchmarking performance?
>
> I can't speak to "standardized," but I do remember some threads that had
> benchmarking going on by various users, using large LilyPond projects.
>
> The Robert Carver "Missa Dum Sacrum Mysterium" from Vaughan McAlley.
> (114 pages)
>
> <https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html>

I took that file and compiled it with 2.19.83 and LilyPond-2.21.0/guile-3
Although, I didn't apply convert-ly, which may be a problem or not.

The results:

2.19.83

1. run

real    3m38,352s
user    3m24,493s
sys    0m4,659s

2. run

real    3m33,635s
user    3m23,291s
sys    0m4,563s


2.21.0 with guile-3

1. run

real    9m41,223s
user    11m36,337s
sys    0m3,600s

2. run

real    9m25,902s
user    11m20,798s
sys    0m3,743s
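
The slowdown implied by these timings can be computed directly. The following is a small Python sketch, not part of the original measurement; `parse_time` and `mean` are hypothetical helper names, and the parser assumes the `NmS,FFFs` layout shown above, including the decimal comma.

```python
import re


def parse_time(value: str) -> float:
    """Convert a `time` field such as '3m38,352s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[,.](\d+)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


def mean(values):
    return sum(values) / len(values)


# Two runs each, copied from the results above.
real_old = mean([parse_time("3m38,352s"), parse_time("3m33,635s")])
real_new = mean([parse_time("9m41,223s"), parse_time("9m25,902s")])
user_old = mean([parse_time("3m24,493s"), parse_time("3m23,291s")])
user_new = mean([parse_time("11m36,337s"), parse_time("11m20,798s")])

real_ratio = real_new / real_old
user_ratio = user_new / user_old
print(f"real: {real_ratio:.2f}x, user: {user_ratio:.2f}x")
```

On these numbers the wall-clock slowdown is a bit under 2.7x and the CPU-time slowdown is close to 3.4x. With guile-3 the user time exceeds the real time, which suggests that some of the work runs on more than one thread.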

Cheers,
  Harm


Re: GUILE 2/3 and string encoding cost

Thomas Morley-2
On Wed, 22 Jan 2020 at 22:36, Thomas Morley
<[hidden email]> wrote:

>
> On Wed, 22 Jan 2020 at 21:52, Karlin High <[hidden email]> wrote:
> >
> > On 1/22/2020 2:07 PM, Han-Wen Nienhuys wrote:
> > > Do we have a standardized test file for benchmarking performance?
> >
> > I can't speak to "standardized," but I do remember some threads that had
> > benchmarking going on by various users, using large LilyPond projects.
> >
> > The Robert Carver "Missa Dum Sacrum Mysterium" from Vaughan McAlley.
> > (114 pages)
> >
> > <https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html>
>
> I took that file and compiled it with 2.19.83 and LilyPond-2.21.0/guile-3

(NB: without patch for Issue 5666 applied)

> Although, I didn't apply convert-ly, which may be a problem or not.
>
> The results:
>
> 2.19.83
>
> 1. run
>
> real    3m38,352s
> user    3m24,493s
> sys    0m4,659s
>
> 2. run
>
> real    3m33,635s
> user    3m23,291s
> sys    0m4,563s
>
>
> 2.21.0 with guile-3
>
> 1.run
>
> real    9m41,223s
> user    11m36,337s
> sys    0m3,600s
>
> 2. run
>
> real    9m25,902s
> user    11m20,798s
> sys    0m3,743s
>
> Cheers,
>   Harm


Re: GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
In reply to this post by dak
On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]> wrote:

>
> > So, what hard data do we have on GUILE 2/3 slowness, and what does
> > that data say?
>
> That data says "humongous slowdown".  There is not much more than
> speculation what this is caused by as far as I know.
>

I can see the 2x slowdown, and it looks uniformly distributed over the
whole process. The GUILE 2.0 release

  https://lwn.net/Articles/428288/

has one big red flag for me.

  * Switch to the Boehm-Demers-Weiser garbage collector

  Guile now uses the Boehm-Demers-Weiser conservative garbage collector
  (aka. libgc).  It makes interaction with C code easier making, for
  instance, the use of mark and free SMOB procedures optional in many
  cases.  It also improves performance.

let me get out the profiler to see what is going on.

--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen

Re: GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
On Wed, Jan 22, 2020 at 10:53 PM Han-Wen Nienhuys <[hidden email]> wrote:

>
>
> On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]> wrote:
>
>>
>> > So, what hard data do we have on GUILE 2/3 slowness, and what does
>> > that data say?
>>
>> That data says "humongous slowdown".  There is not much more than
>> speculation what this is caused by as far as I know.
>>
>
> I can see the 2x slowdown, and it looks uniformly distributed over the
> whole process. The GUILE 2.0 release
>

Actually, I was comparing the -O2 build with the -O0 build.

When recompiling, the Scheme init (reading .scm files) takes 0.31s in 1.8
vs. 2.7s in 2.0, a 9x slowdown.





>   https://lwn.net/Articles/428288/
>
> has one big red flag for me.
>
>   * Switch to the Boehm-Demers-Weiser garbage collector
>
>   Guile now uses the Boehm-Demers-Weiser conservative garbage collector
>   (aka. libgc).  It makes interaction with C code easier making, for
>   instance, the use of mark and free SMOB procedures optional in many
>   cases.  It also improves performance.
>
> let me get out the profiler to see what is going on.
>
> --
> Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen
>


--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen

Re: GUILE 2/3 and string encoding cost

dak
Han-Wen Nienhuys <[hidden email]> writes:

> On Wed, Jan 22, 2020 at 10:53 PM Han-Wen Nienhuys <[hidden email]> wrote:
>
>>
>>
>> On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <[hidden email]> wrote:
>>
>>>
>>> > So, what hard data do we have on GUILE 2/3 slowness, and what does
>>> > that data say?
>>>
>>> That data says "humongous slowdown".  There is not much more than
>>> speculation what this is caused by as far as I know.
>>>
>>
>> I can see the 2x slowdown, and it looks uniformly distributed over the
>> whole process. The GUILE 2.0 release
>>
>
> Actually, the I was comparing the -O2 build with the -O0 build.
>
> When recompiling, the Scheme init (reading .scm files) takes 0.31s in 1.8
> vs. 2.7s in 2.0, a 9x slowdown.

The Guile-2 compiler is doing a lot of optimisations, and LilyPond's
startup code switches off byte compilation because the dependencies are
hard to get under control.  The current codebase at least manages to
avoid compiling code with as-yet undefined macros, something that
Guile-1.8 had no problems with but Guile-2.0 refuses.

So the Scheme loading speed is sort-of expected due to Guile relying on
byte compilation for speed and we switch it off.

--
David Kastrup


Re: GUILE 2/3 and string encoding cost

Thomas Morley-2
In reply to this post by Thomas Morley-2
On Wed, 22 Jan 2020 at 22:32, Thomas Morley
<[hidden email]> wrote:

>
> On Wed, 22 Jan 2020 at 12:12, Thomas Morley
> <[hidden email]> wrote:
> >
> > On Wed, 22 Jan 2020 at 12:02, David Kastrup <[hidden email]> wrote:
> > >
> > > Han-Wen Nienhuys <[hidden email]> writes:
> >
> > > > So, what hard data do we have on GUILE 2/3 slowness, and what does
> > > > that data say?
> > >
> > > That data says "humongous slowdown".  There is not much more than
> > > speculation what this is caused by as far as I know.
> >
> > I can't provide any insight here.
> > Though, once I have a working LilyPond/guile-3, I'll test how
> > "humongous" it will.
> >
> > A working LilyPond/guile-3 means: successfull make, make doc, make
> > test-baseline.
> > Currently only the first is done.
>
> I've now got a successful ´make LANGS='' doc´
>
> ´make test-baseline´ still fails, as it did with all recent guile-versions.
> Though, I don't understand why it does so. Compiling the docs will
> return the regression-tests as html/pdf already.
> So all files are already successfully compiled.
> What's the difference?
>
> Next I'll apply Han-Wen's patch about ly_scm_write_string
> https://sourceforge.net/p/testlilyissues/issues/5666/
> (Earlier I suspected ly_scm_write_string to be the problem)
>
>
> Cheers,
>   Harm

Regrettably, applying the patch for Issue 5666 doesn't solve the
problem, ´make test-baseline´ still errors, without any useful
message.

Cheers,
  Harm


Re: GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
In reply to this post by dak
On Wed, Jan 22, 2020 at 11:43 PM David Kastrup <[hidden email]> wrote:

> > Actually, the I was comparing the -O2 build with the -O0 build.
> >
> > When recompiling, the Scheme init (reading .scm files) takes 0.31s in 1.8
> > vs. 2.7s in 2.0, a 9x slowdown.
>
> The Guile-2 compiler is doing a lot of optimisations, and LilyPond's
> startup code switches off byte compilation because the dependencies are
> hard to get under control.


where does this happen?

> The current codebase at least manages to
> avoid to compile code with as-yet undefined macros, something that
> Guile-1.8 had no problems with but Guile-2.0 refuses.
>

Do you mean that we don't have them anymore, or is there something else
going on?


>
> So the Scheme loading speed is sort-of expected due to Guile relying on
> byte compilation for speed and we switch it off.
>
>
Much to the contrary. Byte-compiling is slow (but running it should be
faster), something you can see from building guile. If it is switched off,
we are getting the "fast" experience.

--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen

Re: GUILE 2/3 and string encoding cost

dak
Han-Wen Nienhuys <[hidden email]> writes:

> On Wed, Jan 22, 2020 at 11:43 PM David Kastrup <[hidden email]> wrote:
>
>> > Actually, the I was comparing the -O2 build with the -O0 build.
>> >
>> > When recompiling, the Scheme init (reading .scm files) takes 0.31s in 1.8
>> > vs. 2.7s in 2.0, a 9x slowdown.
>>
>> The Guile-2 compiler is doing a lot of optimisations, and LilyPond's
>> startup code switches off byte compilation because the dependencies are
>> hard to get under control.
>
>
> where does this happen?

lily/main.cc:     sane_putenv("GUILE_AUTO_COMPILE", "0", true);  // disable auto-compile

Took me a while to find again.
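
As a rough illustration of what that line does, here is a Python sketch of a putenv-style helper with an overwrite flag. The helper `put_env` is hypothetical and only mirrors the shape of the call quoted above; it assumes the third argument means "replace an existing value". `GUILE_AUTO_COMPILE` is the environment variable Guile 2 and later consult to decide whether to compile Scheme source files automatically, and "0" disables that.

```python
import os


def put_env(key: str, value: str, overwrite: bool) -> None:
    """Hypothetical stand-in for a putenv helper with an overwrite flag."""
    # Only replace an existing setting when the caller asks for it.
    if overwrite or key not in os.environ:
        os.environ[key] = value


# Same effect as the quoted call: force auto-compilation off,
# regardless of what the user's environment says.
put_env("GUILE_AUTO_COMPILE", "0", overwrite=True)

# A later call without overwrite leaves the forced value in place.
put_env("GUILE_AUTO_COMPILE", "1", overwrite=False)
```

Because the value is forced, a user who exports GUILE_AUTO_COMPILE=1 in the shell would still get auto-compilation disabled under this model.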

>> The current codebase at least manages to
>> avoid to compile code with as-yet undefined macros, something that
>> Guile-1.8 had no problems with but Guile-2.0 refuses.
>>
>
> Do you mean that we don't have them anymore, or is there something else
> going on?

The source code has been rearranged so that macros are defined before use.

>> So the Scheme loading speed is sort-of expected due to Guile relying on
>> byte compilation for speed and we switch it off.
>>
>>
> Much to the contrary. Byte-compiling is slow (but running it should be
> faster), something you can see from building guile. If it is switched off,
> we are getting the "fast" experience.

There is a difference between file-level compilation and other stuff.
But I don't know the details well.

--
David Kastrup


Re: GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
In reply to this post by Han-Wen Nienhuys-3
On Wed, Jan 22, 2020 at 10:53 PM Han-Wen Nienhuys <[hidden email]> wrote:

> The GUILE 2.0 release
>
>   https://lwn.net/Articles/428288/
>
> has one big red flag for me.
>
>   * Switch to the Boehm-Demers-Weiser garbage collector
>

We can easily measure this, by adding the following to

#(display (version))
#(display " gc time taken: ")
#(display (* 1.0 (/ (cdr (assoc 'gc-time-taken (gc-stats)))
internal-time-units-per-second)))
#(display "\n")

on mozart-hrn-3, over 3 runs, we get

2.0.14  - avg 2.1s
1.8.8 - avg 0.31s

so the new GC is about 5-10x slower than the old one. With GUILE 1.8,
garbage collection typically accounts for 10% of the runtime, so all things
equal, the Boehm GC would cause a 1.5-2.0x slowdown in the total.
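
The estimate above can be reproduced with a simple model in which only the GC share of the runtime is scaled and everything else stays constant. This is a sketch under that assumption; `total_slowdown` is a hypothetical name, and the 10% GC share is the figure quoted above.

```python
def total_slowdown(gc_fraction: float, gc_factor: float) -> float:
    """Overall slowdown when only the GC share of runtime gets slower."""
    return (1.0 - gc_fraction) + gc_fraction * gc_factor


# Ratio of the averaged GC times reported above (2.0.14 vs. 1.8.8).
measured_factor = 2.1 / 0.31

low = total_slowdown(0.10, 5.0)               # GC 5x slower
high = total_slowdown(0.10, 10.0)             # GC 10x slower
measured = total_slowdown(0.10, measured_factor)

print(f"GC factor {measured_factor:.1f}x -> total {measured:.2f}x "
      f"(range {low:.1f}x to {high:.1f}x)")
```

Under this model a 5-10x slower collector gives a total slowdown of about 1.4x to 1.9x, and the measured GC ratio of just under 7x gives about 1.6x, which roughly matches the 1.5-2.0x figure.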

It would be good to see how the JITting of code impacts Scheme execution.

--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen

Re: GUILE 2/3 and string encoding cost

dak
Han-Wen Nienhuys <[hidden email]> writes:

> On Wed, Jan 22, 2020 at 10:53 PM Han-Wen Nienhuys <[hidden email]> wrote:
>
>> The GUILE 2.0 release
>>
>>   https://lwn.net/Articles/428288/
>>
>> has one big red flag for me.
>>
>>   * Switch to the Boehm-Demers-Weiser garbage collector
>>
>
> We can easily measure this, by adding the following to
>
> #(display (version))
> #(display " gc time taken: ")
> #(display (* 1.0 (/ (cdr (assoc 'gc-time-taken (gc-stats)))
> internal-time-units-per-second)))
> #(display "\n")
>
> on mozart-hrn-3, over 3 runs, we get
>
> 2.0.14  - avg 2.1s
> 1.8.8 - avg 0.31s
>
> so the new GC is about 5-10x slower than the old one. With GUILE 1.8,
> garbage collection covers typically is 10% of the runtime, so all things
> equal, the Boehm GC would cause a 1.5-2.0x slowdown in the total.
>
> It would be good to see how the JITting of code impacts Scheme
> execution.

Boehm GC can work in a background thread I think.  And Guile-v2
applications typically just let all their data be treated as pointers
rather than using a smob-marking algorithm like we do, and it is
conceivable that Boehm GC's individual mark function does not scale.

However, considering everything a pointer for a 32bit application that
can eat a significant ratio of the total address space is a nightmare:
there would be just too much memory pinned down due to conservative
garbage collection.

On a 64bit application, this would be somewhat more tenable, but we'd
need to override operator new for smobs.

Or do we?  Maybe the heap is collected by default, and we need to switch
that off?
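
The 32-bit concern can be put into numbers with a deliberately crude model. The sketch below assumes that a non-pointer word is uniformly random over the address space, which real data is not, and picks a 1 GiB heap and a 47-bit user address space for the 64-bit case purely as illustrative values; `false_pointer_chance` is a hypothetical helper.

```python
def false_pointer_chance(heap_bytes: int, address_bits: int) -> float:
    """Chance that a uniformly random word happens to point into the heap."""
    return heap_bytes / float(2 ** address_bits)


GIB = 2 ** 30

# Assumption: a 1 GiB heap in a 32-bit address space.
p32 = false_pointer_chance(1 * GIB, 32)

# Assumption: the same heap with 47 usable address bits on a 64-bit system.
p64 = false_pointer_chance(1 * GIB, 47)

print(f"32-bit: {p32:.2%} of random words look like heap pointers")
print(f"64-bit: {p64:.6%} of random words look like heap pointers")
```

In this model a quarter of all random words would pin some part of the heap on 32 bits, against well under a thousandth of a percent with the larger address space, which is the sense in which conservative scanning is "more tenable" on 64-bit systems.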

--
David Kastrup


Re: GUILE 2/3 and string encoding cost

Han-Wen Nienhuys-3
On Thu, Jan 23, 2020 at 10:39 PM David Kastrup <[hidden email]> wrote:

>
> > on mozart-hrn-3, over 3 runs, we get
> >
> > 2.0.14  - avg 2.1s
> > 1.8.8 - avg 0.31s
> >
> > so the new GC is about 5-10x slower than the old one. With GUILE 1.8,
> > garbage collection covers typically is 10% of the runtime, so all things
> > equal, the Boehm GC would cause a 1.5-2.0x slowdown in the total.
> >
> > It would be good to see how the JITting of code impacts Scheme
> > execution.
>
> Boehm GC can work in a background thread I think.  And Guile-v2
> applications typically just let all their data be treated as pointers
> rather than using a smob-marking algorithm like we do, and it is
> conceivable that Boehm GC's individual mark function does not scale.
>

Do you mean our mechanism to call user-defined mark functions? I doubt that
there are obvious scalability problems in BGC's mark function.

> However, considering everything a pointer for a 32bit application that
> can eat a significant ratio of the total address space is a nightmare:
> there would be just too much memory pinned down due to conservative
> garbage collection.
>
>
GUILE 1.8 already scanned the stack conservatively, so large scores would
probably never work on 32 bits. Was this a concern in the past?  How do
score sizes (in pages) translate to memory usage (in megabytes)?

I think it is reasonable for us to start assuming people run LilyPond on
64-bit machines.


> On a 64bit application, this would be somewhat more tenable, but we'd
> need to override operator new for smobs.
>
> Or do we?  Maybe the heap is collected by default, and we need to switch
> that off?
>
>
What do you mean by "heap is collected"?


--
Han-Wen Nienhuys - [hidden email] - http://www.xs4all.nl/~hanwen