Improve startup cache loading performance#9307
Conversation
0047b0a to
34d1d39
Compare
34d1d39 to
e28b5f1
Compare
|
tested a few failure modes. e.g using a corrupted cache or one from an older NB version and it got reset on start. Detailse.g cache containing invalid data would show: but there are various checks before that so it would fail silently in most cases without even getting to the point where it is loaded. |
- use Data*Streams instead of Object*Streams - simplify file format (no XML) - code renovation and other minor optimizations results in 4x faster cache loading times (readCache() and calculateParents() methods) minor: - ModuleManager.EnableContext: List -> Set for field solely used for contains() in inner loop - Module: data loading can synchronize on instance instead of class (DATA_LOCK no longer static)
e28b5f1 to
54c0790
Compare
|
Nice work! |
| return deps; | ||
| } | ||
|
|
||
| /** |
There was a problem hiding this comment.
This is an exported class - before this change an instance of Dependency could only be created through Dependency#create now there is a option to inject a DataInput. Is there an option to keep this an implementation detail?
The created format does not look like something I'd like people to potentially rely on.
There was a problem hiding this comment.
the class itself can be only constructed from the inside. The create(String body) factory method is essentially a parsing constructor and can be fed Strings like org.netbeans.bootstrap/1 to create dependencies (which is done in some cases).
Even Dependency.create(Dependency.TYPE_MODULE, "hugo"); would already return a dependency.
The first impl for performance measurements had the constructor made public and the new read/write() methods were outside. However, the code was also a bit less maintainable since read/write() are used from two different places:
org.netbeans.ModuleData and org.netbeans.core.startup.ModuleList both use read()/write()
I couldn't come up with a better solution, since the goal is to avoid using ObjectOutputStream. Using read/writeObject actually performed worse and produced larger files, Externalizable would require a public no-arg constructor so I discarded that thought early + I believe it would also have required to keep ObjectOutputStream. (both cases would also have public methods)
Having them package private would require accessor utilities in two modules using the same namespace. Also not pretty. Maybe a comment that the methods are only meant for short term storage would be sufficient?
There was a problem hiding this comment.
package private methods in Dependency + pasting this into the two modules is probably worse than a comment on the methods?
package org.openide.modules;
//...
public class DepAccessor {
public static Dependency read(DataInput is) throws IOException {
return Dependency.read(is);
}
public static void write(Dependency dep, DataOutput os) throws IOException {
dep.write(os);
}
}| .map(e -> e.getKey() + "=" + e.getValue()) | ||
| .collect(Collectors.joining(",")); |
There was a problem hiding this comment.
This will break one either a key or a value contains = or ,.
There was a problem hiding this comment.
yes but can this happen? The set of keys is restricted to (see readProps):
case "name", "jar" -> entry[1];
case "enabled", "autoload", "eager", "reloadable" -> Boolean.valueOf(entry[1]);
case "startlevel" -> Integer.valueOf(entry[1]);
default -> throw new IOException("unknown key " + entry[0]);the first two are String values, the rest are primitives.
This does not cache all properties, only a very limited set. (this was already done before this change)
Data*Streamsinstead ofObject*Streamsresult are about 4x faster cache loading times (writing probably too but its not relevant for UX)
minor:
List->Setfor field solely used forcontains()in inner loopin numbers:
or see the
readCache()andcalculateParents()marked in purplebefore:

after:

third PR for some more startup optimizations pending. (first was #9303)