Entity Framework Vs Json Queries

Data Model

I'm using the below structure to mimic what an ad hock query builder could use for testing user created queries.

The structure for testing searching where each bit of information is saved as a separate row in a table.

The structure for testing searching with the information stored in JSON.

Note: The dynamic table and json structures are separated into different models: Media_Dynamic and Media_Json

Sad Note: At some point Media_Dynamic started getting referred to as just media. While Media_Json is referred to as json. Tying to get this cleaned up.

Test Data

I'm using Bogus with a hardcoded seed value to easily create lots of consistent semi-realistic looking data to add into the above data structures. Media_Dynamic is created then a selection of DynamicFields are used to generate data for the media item into DynamicMediaInformation. When all items for Media_Dynamic are done, this information is duplicated into Media_Json to reduce variables when testing search patterns.

Links to two large bacpac files have been included that were used to benchmark these search patterns: ef_testing_index_large.bacpac and ef_testing_string_large.bacpac which can be found here. The first uses a generated list of 50 different fields including the types: int, bool, date, string. The second is 40 fields of only strings with two additional int fields. Both have 100,000 generated rows of data. A new set can be generated with your own seed and data counts using CreateBogusData.cs

CreateBogusData.cs

The field FakerSeed is the seed used for generating the bogus data.

LoadAllData Can be called to fill the whole database in one go. fieldsCount Will be the list of available dynamic fields each row has available with random data types and sets if the fields are required or optional. mediaItemsCount Is the total number of rows to be able to search over which will be created for one table structure and duplicated into the other. listTypeCount For generating some dropdown types and the list of options available for each.

LoadSharedData Loads the data for the ad hock query builder portion of the database. Filling out the available fields a row can contain. The parameters are the same as above.

LoadMediaData Loads the data for the Dynamic and Json tables for running searches against. First it will create data for the dynamic table, then copy that data into the json table.

Some extra private "large" versions of methods are included that I used to generate the massive bacpac files that have been included. These were only temporarily called and are not currently used. (Needed to pause regularly for my laptop to cool down.)

Search Patterns

Multiple search pattern types were created to see what variants were most efficient. The intent is to see how each handles different search patterns, however all of the below are currently only setup to handle a basic AND search rather than the full query builder outlined above. For each pattern, unit tests were added to allow easily checking their validity along with how they each handle bad user input using NaughtyStrings. As benchmarking has progressed, more variants have been added and older ones deemed consistently worse than similar counterparts were moved out of the way into SearchServiceOld.cs.

Json Patterns

JsonSearch_Indexed_NoColumns
- Not a big change from JsonSearch_Indexed but a business decision changed the information needed to be brought back from the server.
JsonSearch_Indexed_NoColumns_StringAdjustment
- An attempt to see if searching on optional string fields could be sped up
JsonSearch_Indexed
- Similar to JsonSearch_Raw but with columns being indexed
- Note: the testing is setup to make an index for each field but the intention for a production environment would be to only add indexing to the needed fields.
JsonSearch_Raw
- Dynamically generated sql ran with ef
- Uses parametrized values to avoid sql injection.
JsonSearch_EfMagic
- Through custom EF functions, a query is generated.

Dynamic Patterns

TableSearch_Media_NoColumns
- Similar to TableSearch_Media but a lot faster with much less data to be returned.
- note: business decision changed the information needed to be brought back from the server.
TableSearch_Media
- Uses EF to search the database. Uses Includes to get the field information back from the server.
TableSearch_Media_RestrictedColumns
- Similar to TableSearch_Media but with the option to restrict how many columns are returned.
TableSearch_Media_SplitQuery
- Similar to TableSearch_Media but with the use of EF's SplitQuery function instead of Includes
TableSearch_Media_TwoQueries
- Similar to TableSearch_Media but intentionally splits the query into two queries. One to get the list of matching id's and the second to get the column information.

Running

Benchmarks can be setup in Program.cs file.

For running with a debugger:

BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly)
            .Run(args, new DebugInProcessConfig()
                .AddFilter(new AnyCategoriesFilter(new string[] { "miss" }))
                );

For running a proper benchmark test:

BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly)
            .Run(args, DefaultConfig.Instance
                //.AddFilter(new AnyCategoriesFilter(new string[] { "indexed" }))
                //.AddFilter(new AllCategoriesFilter(new string[] { "lessrand", "set1" }))
                .AddExporter(RPlotExporter.Default)
                .AddColumn(CategoriesColumn.Default)
                );

Run with the command dotnet run -c Release

Using AnyCategoriesFilter and AllCategoriesFilter can help narrow which tests can run when doing benchmarking. AnyCategoriesFilter will run any test that matches at least one category in the list and AllCategoriesFilter will require all listed to be included.

Additional notes

In BaseBenchmark.cs file, changing EfTestDbContext.Create(false) to true will print out what queries are being ran.

LessRandomBenchmarksTestReturns can be used to prove that JsonSearch_Indexed and TableSearch_Media return the same data for the same search inputs.

Unit tests are included to make sure each search works as expected and can handle bad user data.

Available Benchmarks

Single search
Multi search
- Hard coded value tests
- Random value tests

Single search

Single search was mainly used for the original setup of benchmark dotnet and getting a basis started for search patterns. All of these were quickly dropped for multi search versions if possible. Though these aren't used anymore, they are still available.

Multi search

This section has the hard coded and random sections. Originally random values were used for testing but realized the consistency of choosing what would be returned was also valuable information to these tests. So added the hard coded section of tests. The random tests can be used with any database. The hard coded ones can be used with the provided bacpac file: ef_testing_index_large.bacpac. Except for StringBenchmarks.cs which needs the other bacpac to work: ef_testing_string_large.bacpac.

Less Random
- Tests the normal search patterns with specific data to see if the location of the data in the database might have changed search times.
- Categories: table, json, first, last, set1, set2
Column Count
- Tests the RestrictedColumns search pattern to see if less data returned would improve dynamic search speed. (less columns actually increased search times)
- Categories: table, columncount, first, set1, set2, count10, count25, countall
Split Query
- Tests the normal json against the "normal", SplitQuery, and TwoQueries versions of the dynamic search patterns. (SplitQuery was consistently worse than everything, TwoQueries was better, but json still out performed)
- Categories: table, json, indexed, media, media2, mediasplit, first, last, set1, set2
No Columns
- Tests normal and NoColumns versions of the search patterns. (Vast improvement to dynamic search times, json didn't change)
- Categories: table, json, hascolumns, nocolumns, first, last, set1, set2
Date Field
- Tests the normal and NoColumns versions of the search patterns against date fields
- Categories: table, json, hascolumns, nocolumns, first, set1, set2, set3
Misc
- Tests the normal and NoColumns versions of the search patterns against some of the datasets that had larger differences in search times
- Categories: indexed, media, AllColumns, NoColumns, #fields (1,2,3,4,5,6,7,9), both, req, op, date, int, bool, string
String
- Tests only NoColumn search patterns against different sets of string search values
- Categories: media, json, req, op, both, one, two, three, four, five, six, seven, eight, buildupon, shortstring, longstring, charcount, ten, twelve, fourteen, sixteen, withint
- Uses ef_testing_string_large.bacpac
Json Index
- These tests require database changes between runs for testing how indexing effects the search speed of different types of searches.
- JsonIndexBenchmarks uses ef_testing_index_large.bacpac with the required changes per run listed in the class
- JsonIndexStringBenchmarks uses ef_testing_string_large.bacpac with the required changes per run listed in the class
- A set of tests only using the json NoColumn search pattern

Results

A lot of the benchmarks included in this project were done before the business decision of not returning the media information. Which vastly improved the speed of the dynamic table search patterns. The test results of these have still been included but will not be reviewed here. These sections include: Column Count, Less Random, and Split Query. While the dynamic search patterns improved with the removal of the column data, the json patterns didn't see much difference, but for the sake of consistency I will focus on the No Column version of the json search patterns. However a lot of the test results still include the original version of json and dynamic search so they can be easily compared with the No Column versions.

The plots are generated using the included BuildPlots_alt.R file found here.

No Columns

The information in Tbl 1 shows how the dynamic search pattern improved when columns were not required to be returned. While the benchmarks on the data types for int and bool show small improvements in speed, ones that include strings show a vast improvement based on how many rows were being requested. set1 showing a drastic improvement from 30 seconds down to 1 second. first showing 3 seconds down to .5 seconds.

While Tbl 2 for json shows removing column information has a slight improvement but most of it can be accounted for with the error and standard deviation.

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1766 (21H1/May2021Update)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.300
  [Host]     : .NET 6.0.5 (6.0.522.21309), X64 RyuJIT
  DefaultJob : .NET 6.0.5 (6.0.522.21309), X64 RyuJIT

Tbl 1: No Columns Benchmarks - dynamic

Method	Mean	Error	StdDev
Media_set1_both_int_bool	145.070 ms	2.5972 ms	3.0917 ms
Media_noColumns_set1_both_int_bool	135.077 ms	1.3963 ms	1.2378 ms
Media_set1_req_int_bool	271.996 ms	4.8273 ms	4.2793 ms
Media_noColumns_set1_req_int_bool	269.232 ms	1.9638 ms	1.5332 ms
Media_set1_op_int	72.571 ms	1.0938 ms	0.9696 ms
Media_noColumns_set1_op_int	67.871 ms	0.9793 ms	0.9160 ms
Media_set1_req_string	169.146 ms	3.3757 ms	7.1939 ms
Media_noColumns_set1_req_string	1,551.383 ms	25.1084 ms	23.4864 ms
Media_set1_op_string_single	22,116.768 ms	1,084.1167 ms	3,145.2191 ms
Media_noColumns_set1_op_string_single	427.422 ms	3.4945 ms	2.9181 ms
Media_set1_req_string_single	29,306.243 ms	579.0830 ms	983.3268 ms
Media_noColumns_set1_req_string_single	853.966 ms	13.1366 ms	10.9696 ms
Media_first_both_bool_int	134.972 ms	1.4135 ms	1.2530 ms
Media_noColumns_first_both_bool_int	134.785 ms	1.9739 ms	1.8464 ms
Media_first_req_bool_int	249.123 ms	3.3765 ms	2.8195 ms
Media_noColumns_first_req_bool_int	265.708 ms	5.0133 ms	4.4442 ms
Media_first_op_int	89.694 ms	1.7909 ms	2.3287 ms
Media_noColumns_first_op_int	90.168 ms	1.6429 ms	1.3719 ms
Media_first_req_string	3,132.863 ms	28.2227 ms	26.3995 ms
Media_noColumns_first_req_string	3,125.529 ms	28.1682 ms	26.3485 ms
Media_first_op_string	492.253 ms	6.4743 ms	5.0547 ms
Media_noColumns_first_op_string	494.283 ms	8.9849 ms	9.9867 ms
Media_first_op_string_single	76.057 ms	2.1989 ms	6.4490 ms
Media_noColumns_first_op_string_single	406.639 ms	5.1703 ms	4.3175 ms
Media_first_req_string_single	139.974 ms	2.7730 ms	6.8541 ms
Media_noColumns_first_req_string_single	827.508 ms	4.6228 ms	3.8603 ms

Tbl 2: No Columns Benchmarks - json

Method	Mean	Error	StdDev
Indexed_first_both_bool_int	6.642 ms	0.1323 ms	0.3531 ms
Indexed_noColumns_first_both_bool_int	6.440 ms	0.1281 ms	0.3094 ms
Indexed_first_req_bool_int	28.642 ms	0.5380 ms	0.4492 ms
Indexed_noColumns_first_req_bool_int	27.370 ms	0.3812 ms	0.3184 ms
Indexed_first_op_int	4.765 ms	0.0953 ms	0.2494 ms
Indexed_noColumns_first_op_int	4.515 ms	0.0884 ms	0.1452 ms
Indexed_first_req_string	1,466.409 ms	28.4672 ms	26.6282 ms
Indexed_noColumns_first_req_string	1,463.739 ms	19.8935 ms	18.6084 ms
Indexed_first_op_string	2,511.278 ms	15.3365 ms	14.3458 ms
Indexed_noColumns_first_op_string	2,517.308 ms	24.1128 ms	21.3753 ms
Indexed_first_op_string_single	2,501.888 ms	6.5184 ms	5.0891 ms
Indexed_noColumns_first_op_string_single	2,513.694 ms	19.7399 ms	17.4989 ms
Indexed_first_req_string_single	1,410.806 ms	22.5999 ms	21.1400 ms
Indexed_noColumns_first_req_string_single	1,409.895 ms	26.6260 ms	24.9060 ms
Indexed_set1_both_int_bool	12.186 ms	0.2398 ms	0.2665 ms
Indexed_noColumns_set1_both_int_bool	6.042 ms	0.1145 ms	0.2655 ms
Indexed_set1_req_int_bool	17.135 ms	0.3383 ms	0.5064 ms
Indexed_noColumns_set1_req_int_bool	13.913 ms	0.2726 ms	0.4163 ms
Indexed_set1_op_int	7.768 ms	0.1404 ms	0.1313 ms
Indexed_noColumns_set1_op_int	3.700 ms	0.0735 ms	0.1910 ms
Indexed_set1_req_string	420.444 ms	8.3722 ms	7.8313 ms
Indexed_noColumns_set1_req_string	418.181 ms	8.0565 ms	6.7275 ms
Indexed_set1_op_string_single	2,514.774 ms	47.6085 ms	44.5330 ms
Indexed_noColumns_set1_op_string_single	2,503.007 ms	44.4445 ms	41.5734 ms
Indexed_set1_req_string_single	1,404.229 ms	23.8953 ms	22.3517 ms
Indexed_noColumns_set1_req_string_single	1,403.096 ms	19.6017 ms	18.3354 ms

Fig 1: No Columns Benchmarks - first

Fig 2: No Columns Benchmarks - set1

Fig 3: No Columns Benchmarks - strings only

Date Field

Date fields were not originally included in the benchmarks because additional work needed to be done to have them work properly. The dynamic search pattern needed a custom EF function added to allow for date conversions on the sql side. While json needed dates converted into a deterministic style to allow for indexing to be added. While the search for int and bool are exact matches, date search was setup to look for values with equal or greater dates than the one provided.

Tbl 3 is a collection of data taken from the previous tables and the benchmark runs for Date fields to compare the search times for dates against those without. Dates can be seen to slightly increase search time for both json and dynamic search styles. Most drastically for dynamic when searching over both required and optional fields with a difference of 322ms. Json is seen to consistently run faster than the dynamic search.

Tbl 3: Date Field Benchmarks

Method	Mean	Error	StdDev
Indexed_noColumns_set1_req_date	13.08 ms	0.259 ms	0.467 ms
Indexed_noColumns_set1_req_int_bool	13.913 ms	0.2726 ms	0.4163 ms
Media_noColumns_set1_req_date	352.69 ms	6.394 ms	5.339 ms
Media_noColumns_set1_req_int_bool	269.232 ms	1.9638 ms	1.5332 ms
Indexed_noColumns_set1_op_date	37.43 ms	0.253 ms	0.211 ms
Indexed_noColumns_set1_op_int	3.700 ms	0.0735 ms	0.1910 ms
Media_noColumns_set1_op_date	113.82 ms	1.351 ms	1.264 ms
Media_noColumns_set1_op_int	67.871 ms	0.9793 ms	0.9160 ms
Indexed_noColumns_set1_both_date	158.5 ms	3.12 ms	2.61 ms
Indexed_noColumns_set1_both_int_bool	6.042 ms	0.1145 ms	0.2655 ms
Media_noColumns_set1_both_date	456.7 ms	2.62 ms	2.19 ms
Media_noColumns_set1_both_int_bool	135.077 ms	1.3963 ms	1.2378 ms
Indexed_noColumns_first_both_date_int_bool	15.07 ms	0.298 ms	0.446 ms
Indexed_noColumns_first_both_bool_int	6.440 ms	0.1281 ms	0.3094 ms
Media_noColumns_first_both_date_int_bool	398.38 ms	5.232 ms	4.638 ms
Media_noColumns_first_both_bool_int	134.785 ms	1.9739 ms	1.8464 ms

Misc

The misc benchmarks are a collection of benchmarks I pulled from other places when I wanted to further dig into what was causing the odd search times in certain scenarios. It also includes extra benchmarks to compare to the oddities.

Tbl 4 shows a collection of different benchmarks comparing json and dynamics no columns search patterns. For the data types of int, bool, and date, json is consistently faster than the dynamic search. Tbl 5 is string only searches and is where the data gets strange. In most test cases where all fields searched are required, json runs faster. However, there seems to be a correlation between the field count and when the dynamic pattern would run faster. Fo the optional field searches, json was consistently worse than the dynamic search. Which leads into the next set of benchmarks focused on strings that go into more detail on these oddities.

Tbl 4: Misc Benchmarks - int, bool, and dates

Method	Mean	Error	StdDev	Field Count
Media_NoColumns_first_both_bool_int	137.736 ms	1.7443 ms	1.5463 ms	6
Indexed_NoColumns_first_both_bool_int	4.225 ms	0.0903 ms	0.2621 ms	6
Media_NoColumns_first_req_bool_int	226.766 ms	2.4157 ms	2.1415 ms	7
Indexed_NoColumns_first_req_bool_int	7.471 ms	0.1444 ms	0.2333 ms	7
Media_NoColumns_first_op_int	90.938 ms	1.4189 ms	1.2578 ms	4
Indexed_NoColumns_first_op_int	3.967 ms	0.0771 ms	0.2174 ms	4
Media_NoColumns_first_both_date_int_bool	404.453 ms	4.1403 ms	3.4573 ms	9
Indexed_NoColumns_first_both_date_int_bool	14.679 ms	0.2885 ms	0.4491 ms	9
Media_NoColumns_set1_req_date	352.554 ms	6.6846 ms	5.9258 ms	3
Indexed_NoColumns_set1_req_date	13.153 ms	0.2595 ms	0.4264 ms	3
Media_NoColumns_set1_op_date	118.228 ms	2.2492 ms	2.3097 ms	2
Indexed_NoColumns_set1_op_date	37.716 ms	0.7233 ms	0.9147 ms	2
Media_NoColumns_set1_both_date	464.646 ms	5.6165 ms	4.6900 ms	5
Indexed_NoColumns_set1_both_date	157.691 ms	2.4201 ms	2.5894 ms	5

Tbl 5: Misc Benchmarks - strings

Method	Mean	Error	StdDev	Field Count
Media_NoColumns_first_req_string	3,184.654 ms	60.3905 ms	56.4893 ms	4
Indexed_NoColumns_first_req_string	1,502.366 ms	19.4950 ms	18.2357 ms	4
Media_NoColumns_first_req_string_single	851.483 ms	14.8172 ms	13.8600 ms	1
Indexed_NoColumns_first_req_string_single	1,430.889 ms	22.8104 ms	21.3369 ms	1
Media_NoColumns_set1_req_string	1,569.290 ms	25.3611 ms	23.7227 ms	2
Indexed_NoColumns_set1_req_string	426.467 ms	5.0285 ms	4.4576 ms	2
Media_NoColumns_set1_req_string_single	865.013 ms	9.8744 ms	8.7534 ms	1
Indexed_NoColumns_set1_req_string_single	1,429.871 ms	27.8590 ms	27.3612 ms	1
Media_NoColumns_set2_req_string	1,582.771 ms	29.5328 ms	27.6250 ms	2
Indexed_NoColumns_set2_req_string	415.466 ms	3.1280 ms	2.4421 ms	2
Media_NoColumns_set2_req_string_single	839.025 ms	3.2153 ms	2.5103 ms	1
Indexed_NoColumns_set2_req_string_single	1,437.138 ms	28.1644 ms	35.6189 ms	1
Media_NoColumns_first_op_string	494.814 ms	9.5839 ms	9.8420 ms	3
Indexed_NoColumns_first_op_string	2,589.288 ms	36.7829 ms	32.6071 ms	3
Media_NoColumns_first_op_string_single	415.320 ms	6.4889 ms	5.7522 ms	1
Indexed_NoColumns_first_op_string_single	2,576.569 ms	30.3320 ms	28.3726 ms	1
Media_NoColumns_set1_op_string_single	437.780 ms	8.5108 ms	7.1069 ms	1
Indexed_NoColumns_set1_op_string_single	2,577.872 ms	41.3992 ms	38.7249 ms	1

Fig 4: Misc Benchmarks

String

These benchmarks use a different database than those above. The file for it has been included: ef_testing_string_large.bacpac. This database was setup specifically to test different scenarios for string searches. It is setup with 41 fields. 20 string columns that are required, 20 that are optional, and one additional column for an int.

String Categories

normal - Each search has a random selection of values with an increasing amount of fields searched on
buildupon - Each search has the same search terms as the previous plus one new one
shortstring - Same as buildupon but each search field uses only 5 character long strings
longstring - Same as buildupon but each search field uses strings of length 10
charcount - Each search is over 4 fields with the lengths of the strings increasing in each search
withint - A copy of buildupon with a required int field added onto each search
optionalint - Similar to buildupon with an optional int field added onto each search

Normal

This first run was somewhat informative. It showed a basic pattern that the dynamic search over strings takes longer based on how many fields are searched over. While the json search changes based on the types of fields searched. With optional fields taking longer than dynamic and required only fields being much faster than the dynamic version. However this data seemed inconsistent because json had some random spikes in its search times.

Tbl 6 shows the raw data and Fig 5 the basic plot for this. This format wasn't very informative so I added additional plots (Fig 6, Fig 7, Fig 8) for better comparing of these search patterns.

Tbl 6: String Benchmarks - Normal

Method	Mean	Error	StdDev
Media_Req_One	153.63 ms	3.046 ms	4.368 ms
Json_Req_One	26.43 ms	0.521 ms	0.714 ms
Media_Req_Two	1,721.58 ms	11.659 ms	10.336 ms
Json_Req_Two	493.26 ms	6.767 ms	6.330 ms
Media_Req_Three	2,572.17 ms	19.979 ms	17.711 ms
Json_Req_Three	1,966.27 ms	11.146 ms	9.880 ms
Media_Req_Four	3,409.05 ms	26.122 ms	24.435 ms
Json_Req_Four	2,383.64 ms	9.347 ms	8.743 ms
Media_Req_Five	4,234.82 ms	10.756 ms	8.982 ms
Json_Req_Five	1,402.98 ms	7.307 ms	5.705 ms
Media_Req_Six	5,221.28 ms	95.834 ms	89.643 ms
Json_Req_Six	2,594.07 ms	14.244 ms	13.324 ms
Media_Req_Seven	6,009.03 ms	16.013 ms	14.979 ms
Json_Req_Seven	2,609.36 ms	9.405 ms	8.797 ms
Media_Req_Eight	6,827.40 ms	19.107 ms	17.872 ms
Json_Req_Eight	4,680.04 ms	10.539 ms	9.342 ms
Media_Op_One	62.31 ms	1.176 ms	1.307 ms
Json_Op_One	28.05 ms	0.264 ms	0.234 ms
Media_Op_Two	863.29 ms	1.770 ms	1.569 ms
Json_Op_Two	2,206.16 ms	15.537 ms	12.974 ms
Media_Op_Three	1,362.10 ms	9.611 ms	8.520 ms
Json_Op_Three	5,340.91 ms	19.233 ms	17.050 ms
Media_Op_Four	1,798.65 ms	3.925 ms	3.671 ms
Json_Op_Four	5,403.10 ms	33.622 ms	31.450 ms
Media_Op_Five	2,185.05 ms	4.914 ms	4.597 ms
Json_Op_Five	5,371.52 ms	21.022 ms	18.635 ms
Media_Op_Six	2,669.32 ms	18.444 ms	17.253 ms
Json_Op_Six	5,461.66 ms	10.487 ms	9.810 ms
Media_Op_Seven	2,793.89 ms	5.922 ms	4.624 ms
Json_Op_Seven	6,735.02 ms	16.532 ms	15.464 ms
Media_Op_Eight	3,486.25 ms	13.419 ms	11.896 ms
Json_Op_Eight	5,306.52 ms	18.558 ms	17.359 ms
Media_Both_Two	1,338.78 ms	9.500 ms	8.887 ms
Json_Both_Two	2,536.93 ms	10.512 ms	9.833 ms
Media_Both_Four	2,693.03 ms	10.054 ms	9.404 ms
Json_Both_Four	5,431.76 ms	14.023 ms	13.117 ms
Media_Both_Six	3,682.96 ms	12.264 ms	11.471 ms
Json_Both_Six	3,665.26 ms	8.397 ms	7.444 ms
Media_Both_Eight	5,169.48 ms	21.261 ms	18.848 ms
Json_Both_Eight	5,442.79 ms	13.986 ms	13.083 ms

Fig 5: String Benchmarks - Normal - Box Plot

Fig 6: String Benchmarks - Normal - Line Plot - Required Only

Fig 7: String Benchmarks - Normal - Line Plot - Optional Only

Fig 8: String Benchmarks - Normal - Line Plot - Both

Build Upon

Similar to the last benchmark, this one was made to get more consistent data. Each test builds on the previous set of fields so that all of the search fields are the same except for the newest one added. Example:

{ 1 , "possimus" }

{ 1 , "possimus" },
{ 2 , "tempore" }

{ 1 , "possimus" },
{ 2 , "tempore" },
{ 3 , "numquam" }

This style looks to have worked. As there is no longer any unexpected changes in the data. The plots Fig 10, Fig 11, and Fig 12 show the new and previous tests for comparison. While the dynamic(media) search shows a consistent increase for every field added, json displays a consistent speed for any fields added after the third, seen in Fig 10. Which vastly increases how fast it does a search compared to dynamic when all fields are required. However, while json is still very consistent for optional fields, media is faster though the difference decreases the more fields that are added, as seen in Fig 11 and Fig 12.

Tbl 7: String Benchmarks - Build Upon

Method	Mean	Error	StdDev
Media_Req_One_BuildUpon	146.83 ms	2.882 ms	4.735 ms
Json_Req_One_BuildUpon	26.24 ms	0.439 ms	0.343 ms
Media_Req_Two_BuildUpon	1,598.75 ms	25.381 ms	22.500 ms
Json_Req_Two_BuildUpon	326.68 ms	3.764 ms	3.336 ms
Media_Req_Three_BuildUpon	2,412.93 ms	33.383 ms	31.226 ms
Json_Req_Three_BuildUpon	1,276.54 ms	16.827 ms	14.917 ms
Media_Req_Four_BuildUpon	3,181.99 ms	23.472 ms	21.955 ms
Json_Req_Four_BuildUpon	1,270.15 ms	15.886 ms	14.860 ms
Media_Req_Five_BuildUpon	3,987.76 ms	13.665 ms	12.114 ms
Json_Req_Five_BuildUpon	1,269.49 ms	14.888 ms	13.926 ms
Media_Req_Six_BuildUpon	4,768.32 ms	28.002 ms	31.124 ms
Json_Req_Six_BuildUpon	1,270.05 ms	11.058 ms	10.344 ms
Media_Req_Seven_BuildUpon	5,562.17 ms	23.576 ms	22.053 ms
Json_Req_Seven_BuildUpon	1,264.94 ms	9.581 ms	8.000 ms
Media_Req_Eight_BuildUpon	6,349.26 ms	38.729 ms	34.332 ms
Json_Req_Eight_BuildUpon	1,268.51 ms	16.261 ms	15.211 ms
Media_Op_One_BuildUpon	100.09 ms	1.952 ms	2.922 ms
Json_Op_One_BuildUpon	165.85 ms	3.302 ms	3.391 ms
Media_Op_Two_BuildUpon	801.75 ms	5.369 ms	4.484 ms
Json_Op_Two_BuildUpon	3,619.08 ms	43.439 ms	40.633 ms
Media_Op_Three_BuildUpon	1,205.82 ms	15.057 ms	14.084 ms
Json_Op_Three_BuildUpon	4,761.12 ms	12.263 ms	10.871 ms
Media_Op_Four_BuildUpon	1,616.83 ms	14.126 ms	13.214 ms
Json_Op_Four_BuildUpon	4,760.38 ms	43.668 ms	34.093 ms
Media_Op_Five_BuildUpon	2,013.39 ms	23.448 ms	21.933 ms
Json_Op_Five_BuildUpon	4,751.75 ms	10.917 ms	9.116 ms
Media_Op_Six_BuildUpon	2,402.93 ms	15.626 ms	13.852 ms
Json_Op_Six_BuildUpon	4,764.77 ms	36.133 ms	30.172 ms
Media_Op_Seven_BuildUpon	2,818.16 ms	34.750 ms	32.505 ms
Json_Op_Seven_BuildUpon	4,757.01 ms	9.972 ms	8.840 ms
Media_Op_Eight_BuildUpon	3,208.73 ms	32.084 ms	30.012 ms
Json_Op_Eight_BuildUpon	4,755.69 ms	17.241 ms	14.397 ms
Media_Both_Two_BuildUpon	1,191.22 ms	22.021 ms	20.598 ms
Json_Both_Two_BuildUpon	2,667.68 ms	27.364 ms	24.257 ms
Media_Both_Four_BuildUpon	2,390.84 ms	35.557 ms	33.260 ms
Json_Both_Four_BuildUpon	4,758.69 ms	4.975 ms	3.884 ms
Media_Both_Six_BuildUpon	3,404.36 ms	31.111 ms	29.101 ms
Json_Both_Six_BuildUpon	4,762.67 ms	15.341 ms	13.600 ms
Media_Both_Eight_BuildUpon	4,593.04 ms	7.522 ms	6.668 ms
Json_Both_Eight_BuildUpon	4,756.97 ms	8.630 ms	7.651 ms

Fig 9: String Benchmarks - Build Upon - Box Plot

Fig 10: String Benchmarks - Build Upon - Line Plot - Required Only

Fig 11: String Benchmarks - Build Upon - Line Plot - Optional Only

Fig 12: String Benchmarks - Build Upon - Line Plot - Both

Short String & Long String

These two benchmarks turned out very similar in speeds. With any difference being quite negligible. The most interesting part of this benchmark is that the both tests for the json search pattern actually run faster than the dynamic search patterns. As with previous tests json usually takes longer than the dynamic search. However, I'm unable to explain why this test ended up so different compared to previous runs.

Fig 13: String Benchmarks - Short String - Line Plot - Both

Fig 14: String Benchmarks - Long String - Line Plot - Both

Char Count

This test was over four fields for each test but with the character count for each search changing. This data is displayed in Fig 15 and shows that each search is very consistent. With the json patterns slightly decreasing with longer strings and dynamic patterns slightly increasing. However these changes can be explained by the error and standard deviation.

Fig 15: String Benchmarks - Char Count - Line Plot

With Required and Optional Int

The tests withint and optionalint are similar to the buildupon tests but with an int field added onto each search (not included in field counts). This benchmark shows how drastically the search time reduces by having a single non-string field included in the search terms. The full comparison can be seen in Fig 16.

However, the times between json and dynamic search patterns are not so different, though json is consistently the faster of the two. This can be seen with Fig 17 and Fig 18.

Fig 16: String Benchmarks - With int Compared to Build Upon - Line Plot - Optional

Fig 17: String Benchmarks - With int - Line Plot - Optional

Fig 18: String Benchmarks - With int - Line Plot - Required

Json Index Benchmarks

This ended up being a combination of a few different benchmarks and tests on the two provided databases. Rather than making multiple copies of the tables for the different combination of tests with and without indexing, I simply added and removed the needed indexes between each run of the tests.

Index Benchmarks

JsonIndexBenchmarks uses ef_testing_index_large.bacpac
- With indexes - This was used as a baseline run for comparing with removed indexes
- Without searched indexes - Removed indexes on columns 1 and 2 that were used in the searches
- Without unsearched indexes - Added the removed indexes back and then removed indexes for columns 12, 31, 32, and 47, to cover a combination of required and optional fields.
JsonIndexStringBenchmarks uses ef_testing_string_large.bacpac
- With indexes - This was used as a baseline run for comparing with removed indexes
- Without searched indexes - Removed indexes on columns 1 and 48 that were used in the searches
- Without unsearched indexes - Added the removed indexes back and then removed indexes for columns 4, 6, 37, and 39, to cover a combination of required and optional fields.
Misc Test - Uses the Misc Benchmarks file
- Baseline - A normal run of the Misc benchmarks to use as comparison
- No indexing - A run of the Misc Benchmarks after all indexes on the database have been removed

Json Index

This benchmark uses ef_testing_index_large.bacpac. And was ran three times: All columns indexed, some searched fields with indexes removed, and once with all search fields having indexes and some that were not searched having indexes removed.

In this test I focused on basic data types, Int and Bool. Fig 19 shows how including indexes on these columns types is negligible for required and optional fields.

Fig 19: Json Index Benchmarks - Box Plot

Json Index String

This benchmark uses ef_testing_string_large.bacpac. And was ran three times: All columns indexed, some searched fields with indexes removed, and once with all search fields having indexes and some that were not searched having indexes removed. Fig 20 shows how including indexes on these searches was negligible for required and optional fields.

Fig 20: Json Index String Benchmarks - Box Plot

Misc Test

After the previous two benchmarks seemed to display that indexing was not necessary, I decided to run a more thorough test using the Misc benchmarks. Once with all indexes included to be the baseline for comparison, and once with all indexes removed to see if there were specific cases where indexing improved performance.

Tbl 8 shows the results of both runs, Baseline and NotIndexed. With most comparisons showing not enough change to make a difference. It looks like the only columns that see any improvement in search time with an index are optional primitive fields.

Tbl 8: Json Index Misc Benchmarks

Method	Categories	Mean	Error	StdDev
BaseLine_first_req_bool_int	7fields,req,bool,int	7.46 ms	0.148 ms	0.363 ms
NotIndexed_first_req_bool_int	7fields,req,bool,int	43.92 ms	0.875 ms	1.644 ms
BaseLine_first_op_int	4fields,op,int	3.89 ms	0.078 ms	0.180 ms
NotIndexed_first_op_int	4fields,op,int	332.40 ms	6.556 ms	5.811 ms
BaseLine_first_both_bool_int	6fields,both,bool,int	4.53 ms	0.089 ms	0.122 ms
NotIndexed_first_both_bool_int	6fields,both,bool,int	337.93 ms	6.630 ms	9.076 ms
BaseLine_set1_req_date	3fields,req,date	13.23 ms	0.262 ms	0.269 ms
NotIndexed_set1_req_date	3fields,req,date	13.81 ms	0.166 ms	0.148 ms
BaseLine_set1_op_date	2fields,op,date	38.54 ms	0.649 ms	0.607 ms
NotIndexed_set1_op_date	2fields,op,date	38.78 ms	0.701 ms	0.656 ms
BaseLine_set1_both_date	5fields,both,date	157.39 ms	3.027 ms	2.683 ms
NotIndexed_set1_both_date	5fields,both,date	156.83 ms	0.935 ms	0.874 ms
BaseLine_first_both_date_int_bool	9fields,both,bool,int,date	14.83 ms	0.295 ms	0.384 ms
NotIndexed_first_both_date_int_bool	9fields,both,bool,int,date	315.47 ms	5.305 ms	4.702 ms
BaseLine_first_req_string_single	1fields,req,string	1,417.21 ms	9.897 ms	9.257 ms
NotIndexed_first_req_string_single	1fields,req,string	1,413.70 ms	7.828 ms	6.939 ms
BaseLine_set1_req_string_single	1fields,req,string	1,412.81 ms	10.720 ms	10.027 ms
NotIndexed_set1_req_string_single	1fields,req,string	1,433.07 ms	28.497 ms	34.997 ms
BaseLine_set2_req_string_single	1fields,req,string	1,388.77 ms	9.584 ms	8.965 ms
NotIndexed_set2_req_string_single	1fields,req,string	1,406.49 ms	27.769 ms	29.712 ms
BaseLine_set1_req_string	2fields,req,string	417.36 ms	4.475 ms	3.967 ms
NotIndexed_set1_req_string	2fields,req,string	432.84 ms	8.642 ms	17.458 ms
BaseLine_set2_req_string	2fields,req,string	413.73 ms	3.119 ms	2.605 ms
NotIndexed_set2_req_string	2fields,req,string	415.97 ms	6.713 ms	5.950 ms
BaseLine_first_req_string	4fields,req,string	1,474.58 ms	9.438 ms	8.367 ms
NotIndexed_first_req_string	4fields,req,string	1,475.95 ms	11.583 ms	10.268 ms
BaseLine_set1_op_string_single	1fields,op,string	2,616.05 ms	51.374 ms	88.617 ms
NotIndexed_set1_op_string_single	1fields,op,string	2,588.05 ms	35.577 ms	33.278 ms
BaseLine_first_op_string_single	1fields,op,string	2,548.62 ms	13.041 ms	11.560 ms
NotIndexed_first_op_string_single	1fields,op,string	2,554.74 ms	15.754 ms	13.965 ms
BaseLine_first_op_string	3fields,op,string	2,553.84 ms	16.913 ms	14.993 ms
NotIndexed_first_op_string	3fields,op,string	2,561.59 ms	16.619 ms	14.732 ms

Fig 21: Json Index Misc Benchmarks - Box Plot

Name		Name	Last commit message	Last commit date
Latest commit History 187 Commits
data files		data files
readme files		readme files
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Entity Framework Vs Json Queries

Contents

Data Model

Test Data

CreateBogusData.cs

Search Patterns

Json Patterns

Dynamic Patterns

Running

Additional notes

Available Benchmarks

Single search

Multi search

Categories

Results

No Columns

Date Field

Misc

String

String Categories

Normal

Build Upon

Short String & Long String

Char Count

With Required and Optional Int

Json Index Benchmarks

Index Benchmarks

Json Index

Json Index String

Misc Test

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages