Univerzita Tomáše Bati ve Zlíně
TRILOBIT
Speed differences between the Mathematica software and the C# language when searching for extremes

David Malaník | 11. 11. 2010 18:36:53
Category: Informatics | Issue 2/2010 | Scientific paper

Abstract

This paper compares the speed of mathematical computation when searching for extremes of multidimensional functions. Two fundamentally different tools were compared: Wolfram Mathematica 7, a software environment for mathematical simulation, and the general-purpose programming language C# [2]. When searching for extreme values of one-dimensional and multidimensional functions, the speed of the mathematical operations is crucial, because it determines how quickly an extreme is found. The comparison metric is therefore the number of evaluations of the multidimensional function per second.

All simulations were run on one PC without any change to its configuration.

Introduction

The speed of mathematical operations matters because software is needed to compute extreme values of multidimensional problems, which can be represented by multidimensional functions with one or more extremes. More and more real-world problems are solved by computers. In many cases most of the time is spent on the mathematical definition of the problem, but that is only the first step. The second step is solving the problem in application software, and here the speed of the system becomes important. It is not unusual to solve a problem represented by a function with 10 or 100 dimensions. Humans find it hard to imagine or draw a space with more than 4 or 5 dimensions, but a computer has no such difficulty: it simply sees a function of 100 parameters, and there is no need to draw the problem or to imagine a 100-dimensional space.

When a computer is used to find the extreme value of such a function, only one question remains: how long the computation takes. This is a question of speed, and it has two main parts. The first is the speed of the algorithm; the second is the speed of the computer. The second part can be decomposed further. One component is the speed of the hardware, which is less important here because the performance of current computers changes very quickly. The other component is the speed of the application. This is often the critical one, because software may be unable to use the power of the computer effectively. In that case even the most powerful computer in the world will not speed up the calculation.

Speed is probably one of the most important questions in many areas of computer science today.

Specification of the testing environment

The tests were run on a personal computer with the specification shown in Table 1.

Component | Specification
CPU | Intel i7 920 @ 2.83 GHz (4 cores, 8 threads)
RAM | 6 GB DDR II, 1600 MHz
HDD | RAID5 volume (approx. read 170 MB/s, write 150 MB/s)
Graphics card | NVidia GeForce 9600GT, 1 GB VRAM (64 CUDA[3] 3.0 cores)
Operating system | Microsoft Windows 7 Professional
Wolfram software | Mathematica 7
Programming language | C#, .NET Framework 4.0

Table 1 - Computer specification

The extreme-finding methods were written in both the Mathematica 7 environment and C#. The two implementations were similar; they differed only in the number of operations needed to calculate the extreme value. C# does not have a mathematical library as extensive as Mathematica's, so some algorithms had to be implemented as custom functions. As a result, the C# code is longer than the Mathematica 7 code.

The particular search method is not important here, because this paper uses the speed of evaluating the function value to show the differences between the two solutions.
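The metric can be illustrated with a minimal Python sketch. It is not the code used in the paper: the test function (`sphere`), the search method (plain random search), the search range and all names are assumptions chosen only to show how evaluations per second can be counted.

```python
import random
import time


def sphere(x):
    """Hypothetical test function: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def random_search(func, dims, budget_s, low=-5.0, high=5.0, seed=0):
    """Evaluate func at random points for about budget_s seconds.

    Returns (best_value, evaluations, evaluations_per_second).
    """
    rng = random.Random(seed)
    best = float("inf")
    evaluations = 0
    start = time.perf_counter()
    while time.perf_counter() - start < budget_s:
        point = [rng.uniform(low, high) for _ in range(dims)]
        value = func(point)
        evaluations += 1
        if value < best:
            best = value
    elapsed = time.perf_counter() - start
    return best, evaluations, evaluations / elapsed


if __name__ == "__main__":
    # Same dimensions as in the paper; the 0.2 s budget is arbitrary.
    for dims in (10, 20, 50):
        best, count, rate = random_search(sphere, dims, budget_s=0.2)
        print(f"{dims}D: {count} evaluations, {rate:.1f} per second")
```

Because the count and the elapsed time come from the same loop, the resulting rate includes the cost of generating the points, which is also true of any search method that would replace the random search here.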

The competing sides are shown in Figure 1. The tests compared only Mathematica 7 and the C# language.

Figure 1 - Competition sides

Specification of the testing method

The test procedure was divided into three parts. In Mathematica 7, functions with 10 and 20 dimensions were tested. In C#, functions with 10, 20 and 50 dimensions were tested, because the simulation ran faster than in Mathematica 7.

The first part tested the speed difference between applications that use only one thread. This setup does not benefit from the evolution of computing power, because modern processors provide more than one thread (or core).

The second part used all available threads of the computer (8 threads on the test machine). This requires that the algorithm can be parallelized. Extreme-finding algorithms for multidimensional functions are easy to parallelize, because they consist of many identical operations in which the algorithm evaluates the function at specific coordinates. In this case the power of the computer can be used to the maximum, and these tests used all processor cores (threads). Parallelization relied on the automatic parallelization functions of both environments: the function "Parallelize[]" in Mathematica 7 and the built-in function "Parallel.For()" in C#. No manual changes were made to the main testing algorithm.
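The structure that makes this kind of search easy to parallelize can be sketched in Python. This is an illustration only, not the paper's C# or Mathematica code: `ThreadPoolExecutor` stands in for "Parallel.For()" and "Parallelize[]", and the test function, range and all names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def sphere(x):
    """Hypothetical test function (sum of squares)."""
    return sum(v * v for v in x)


def best_of_chunk(args):
    """Evaluate `count` random points and return the lowest value found."""
    seed, count, dims = args
    rng = random.Random(seed)  # one generator per task, nothing shared
    return min(
        sphere([rng.uniform(-5.0, 5.0) for _ in range(dims)])
        for _ in range(count)
    )


def parallel_search(dims, total_points, workers=8):
    """Split the evaluations into independent chunks, one per worker."""
    per_worker = total_points // workers
    tasks = [(seed, per_worker, dims) for seed in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_minima = list(pool.map(best_of_chunk, tasks))
    return min(partial_minima)  # the only step that needs all results
```

Each chunk touches no shared state, so the only synchronization point is the final `min`. Note that in CPython, threads do not run CPU-bound Python code simultaneously, so this sketch shows the decomposition rather than a real speedup; a process pool would be needed for that.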

The third part outlines a future solution that uses the graphics card for this calculation and describes the potential of modern graphics cards for specific kinds of mathematical calculation.

Testing reports

Single thread Mathematica 7

This part tested the single-thread version of the testing algorithm. In Mathematica 7, only functions with 10 and 20 dimensions were tested.

The results of the 10-dimensional simulation are shown in Table 2. During the simulation more than 2.8 million values of the test function were calculated, which took more than 13 minutes.

Parameter | Value
Simulations | 10
Dimensions | 10
Function evaluations | 2 861 760
Time | 13 min 41 s (821 s)
Calculations/s | 3 485.7

Table 2 - Simulation result 10D Mathematica 7 (single thread)
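The last row of each result table follows from the two rows above it. As a quick check, the short Python sketch below recomputes the Table 2 figures; the helper names are illustrative and not part of the original test code.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration given as h/m/s into seconds."""
    return hours * 3600 + minutes * 60 + seconds


def throughput(evaluations, elapsed_s):
    """Function evaluations per second."""
    return evaluations / elapsed_s


# Values copied from Table 2 (10D, Mathematica 7, single thread).
elapsed = to_seconds(minutes=13, seconds=41)
rate = throughput(2_861_760, elapsed)
print(elapsed, round(rate, 1))  # prints: 821 3485.7
```

The same two helpers reproduce the calculations-per-second row of the other result tables from their evaluation counts and times.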

The results of the 20-dimensional simulation are shown in Table 3. The simulation calculated approx. 8 million values of the test function, which took more than 1 hour.

Parameter | Value
Simulations | 10
Dimensions | 20
Function evaluations | 7 963 040
Time | 1 h 10 min 36 s (4 236 s)
Calculations/s | 1 879.8

Table 3 - Simulation result 20D Mathematica 7 (single thread)

Figure 2 shows the CPU usage history during the simulation. The simulation ran in a single thread, yet no CPU core was fully utilized. The single-thread version of this application is therefore not very efficient, and part of the computer's power remains unused.

 

Figure 2 - Single thread Mathematica 7 during calculation of extreme value

Single thread C#

This section reports the single-thread tests in C#. The test application was created as a simple console application to eliminate the influence of a graphical user interface. It was created as a single-thread application without any parallel section. The code of the testing algorithm was longer than in Mathematica, because C# lacks some mathematical functions, such as a random real number generator with a specified range and precision.
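A helper of the kind mentioned above is small. The following Python sketch shows one possible shape for it; the paper does not show its C# implementation, so the function name, the rounding rule and the clamping step are assumptions.

```python
import random


def random_real(rng, low, high, decimals):
    """Uniform random real from [low, high], rounded to `decimals` places."""
    unit = rng.random()                  # uniform in [0.0, 1.0)
    value = low + (high - low) * unit    # scale and shift into the range
    rounded = round(value, decimals)     # limit the precision
    # Rounding can step just outside the range when a bound is not itself
    # representable at this precision, so clamp the result.
    return min(max(rounded, low), high)


rng = random.Random(42)
sample = [random_real(rng, -5.12, 5.12, 3) for _ in range(5)]
```

Scaling a unit-interval generator is the usual approach when a library offers only a generator on [0, 1).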

The report from the 10-dimensional simulation is shown in Table 4. The simulation performed more than 79 million evaluations of the test function and took approx. 8 minutes.

Parameter | Value
Simulations | 10
Dimensions | 10
Function evaluations | 79 055 739
Time | 7 min 42 s (462 s)
Calculations/s | 171 116.3

Table 4 - Simulation result 10D C# (single thread)

The test with 20 dimensions produced more than 158 million values of the test function. This simulation set took more than 17 minutes. The result is shown in Table 5.

Parameter | Value
Simulations | 10
Dimensions | 20
Function evaluations | 158 513 565
Time | 17 min 29 s (1 049 s)
Calculations/s | 151 109.2

Table 5 - Simulation result 20D C# (single thread)

The last simulation set contained 10 simulations of a 50-dimensional function. The test produced approx. 400 million values of the test function and took more than 1 hour. The result is shown in Table 6.

Parameter | Value
Simulations | 10
Dimensions | 50
Function evaluations | 397 474 241
Time | 1 h 4 min 14 s (3 854 s)
Calculations/s | 103 132.9

Table 6 - Simulation result 50D C# (single thread)

Figure 3 shows the CPU usage history during the simulation. The simulation ran in a single thread, and again no single CPU core was fully utilized, so part of the computer's power remained unused. However, the operating system used more than one core for the calculation. In this case the single-thread C# application was able to use more computing capacity than Mathematica; the difference was more than 100%. The single-thread C# application was better optimized for running on multicore processors.

Figure 3 – Single thread C# during calculation of extreme value

Multi thread Mathematica 7

This part reports the tests that used the built-in automatic parallelization functions of both "languages". The first report is from Mathematica with the built-in function "Parallelize[]". The test was the same: ten simulations each with the 10- and 20-dimensional test functions. The report for the 10D function is shown in Table 7.

Parameter | Value
Simulations | 10
Dimensions | 10
Function evaluations | 3 295 360
Time | 18 min 35 s (1 115 s)
Calculations/s | 2 955.5

Table 7 - Simulation result 10D Mathematica 7 (8 threads)

The second Mathematica test used the 20-dimensional test function. The function "Parallelize[]" was again used for parallelization. The report from this test is shown in Table 8.

Parameter | Value
Simulations | 10
Dimensions | 20
Function evaluations | 5 529 280
Time | 51 min 31 s (3 091 s)
Calculations/s | 1 788.8

Table 8 - Simulation result 20D Mathematica 7 (8 threads)

Figure 4 shows the CPU usage during this calculation. The automatic function could not spread the load across all processor cores. The figure shows that the application used only 3 cores: one at 75% and the others at 20%. The total performance was approximately the same as in the single-thread version, with no speedup. Throughput actually decreased slightly, because part of the processing capacity was consumed by parallelization and synchronization of the threads.

Figure 4 - Multi thread Mathematica 7 during calculation of extreme value

Multi thread C#

These tests were done with the multithread version of the test algorithm in C#. Version 4.0 introduced a function for parallelizing loops, and the tests used the function "Parallel.For()". There were three tests: the first with a 10-dimensional function, the second with a 20-dimensional function and the third with a 50-dimensional function.

The report from the first test, with the 10-dimensional test function, is shown in Table 9. The simulation produced more than 78 million function values and took approx. 4 minutes, which is approx. 50% of the time taken by the single-thread version.

Parameter | Value
Simulations | 10
Dimensions | 10
Function evaluations | 78 572 619
Time | 3 min 58 s (238 s)
Calculations/s | 330 137

Table 9 - Simulation result 10D C# (8 threads)

 

The report for the test with the 20-dimensional function is shown in Table 10. This test produced more than 157 million values of the test function and took approx. 9.5 minutes, which is approx. 55% of the time taken by the single-thread version of the application.

Parameter | Value
Simulations | 10
Dimensions | 20
Function evaluations | 157 422 707
Time | 9 min 30 s (570 s)
Calculations/s | 276 180.2

Table 10 - Simulation result 20D C# (8 threads)

The last test used the 50-dimensional test function. The report is shown in Table 11. This simulation set produced approx. 400 million values of the tested function. The simulation set took approx. 34.5 minutes, which is approx. 54% of the time taken by the single-thread version.

Parameter | Value
Simulations | 10
Dimensions | 50
Function evaluations | 396 027 020
Time | 34 min 32 s (2 072 s)
Calculations/s | 191 132.7

Table 11 - Simulation result 50D C# (8 threads)

Figure 5 shows the CPU usage during the extreme-value calculation with the multithread version of the test algorithm written in C#. The application was able to use the computing capacity of every available processor core. This version of the algorithm is independent of the number of cores and can be used on other systems without rebuilding. The multithread application was almost twice as fast as the single-thread one (about 1.8 to 1.9 times, measured in calculations per second). Some computing capacity was lost to synchronization between threads and to the overhead of the parallel function, but the speed of the algorithm still increased.
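The speedup can be recomputed directly from the reported throughputs. The Python sketch below uses the calculations-per-second values from the single-thread and 8-thread C# result tables; the variable names and the per-thread efficiency measure are additions for illustration, not something the paper reports.

```python
# Calculations per second reported for C#: single thread vs. 8 threads.
single = {10: 171_116.3, 20: 151_109.2, 50: 103_132.9}
multi = {10: 330_137.0, 20: 276_180.2, 50: 191_132.7}

THREADS = 8  # hardware threads of the test machine (4 physical cores)

# Speedup = parallel throughput / single-thread throughput.
speedup = {d: multi[d] / single[d] for d in single}
# Efficiency = fraction of the ideal 8x gain that was achieved.
efficiency = {d: speedup[d] / THREADS for d in single}

for d in sorted(speedup):
    print(f"{d}D: speedup {speedup[d]:.2f}x, efficiency {efficiency[d]:.0%}")
```

All three speedups fall between 1.8 and 2.0, well below the ideal factor of 8, which is consistent with the synchronization overhead described above.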

Figure 5 - Multi thread C# during calculation of extreme value

Comparison of speed

This part compares the speed of the two technologies. The first comparison is between the single-thread versions of the test application. The differences are shown in Table 12 and in Figure 6.

Test | Mathematica 7 | Mathematica 7 | C# 4.0 | C# 4.0 | C# 4.0
Dimensions | 10 | 20 | 10 | 20 | 50
Function evaluations | 2 861 760 | 7 963 040 | 79 055 739 | 158 513 565 | 397 474 241
Time | 13m 41s | 1h 10m 36s | 7m 42s | 17m 29s | 1h 4m 14s
Calculations/s | 3 485.7 | 1 879.8 | 171 116.3 | 151 109.2 | 103 132.9

Table 12 - Single thread comparison

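This part is more objective than the second part, because it does not depend on the automatic parallelization functions. The table shows the differences clearly: in calculations per second, C# is roughly 49 times faster for the 10D function and roughly 80 times faster for the 20D function.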
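The ratio between the two environments follows from the calculations-per-second row of Table 12. A short Python check (variable names are illustrative):

```python
# Single-thread calculations per second from Table 12.
mathematica = {10: 3_485.7, 20: 1_879.8}
csharp = {10: 171_116.3, 20: 151_109.2}

# How many times faster C# evaluated the test function.
ratio = {d: csharp[d] / mathematica[d] for d in mathematica}

for d in sorted(ratio):
    print(f"{d}D: C# is {ratio[d]:.1f}x faster")
```

Only the 10D and 20D cases can be compared, because the 50D function was not run in Mathematica 7.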

 

Figure 6 - Single thread - operation/second

The second part contains the simulations with the multithread versions of the testing methods. In this test there is a problem with the automatic parallelization function in Mathematica 7, which had no positive effect. The gain from multithreading in the C# application, however, is clearly visible. The summary of the multithread tests is shown in Table 13 and in Figure 7.

Test | Mathematica 7 | Mathematica 7 | C# 4.0 | C# 4.0 | C# 4.0
Dimensions | 10 | 20 | 10 | 20 | 50
Function evaluations | 3 295 360 | 5 529 280 | 78 572 619 | 157 422 707 | 396 027 020
Time | 18m 35s | 51m 31s | 3m 58s | 9m 30s | 34m 32s
Calculations/s | 2 955.5 | 1 788.8 | 330 137 | 276 180.2 | 191 132.7

Table 13 - Multi thread comparison
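Two quantities can be derived from the multi-thread summary: how Mathematica's throughput changed when "Parallelize[]" was enabled, and how large the gap to C# became. The Python sketch below recomputes both from the reported calculations per second; the variable names are illustrative.

```python
# Calculations per second reported for Mathematica 7 and for C# (8 threads).
mathematica_single = {10: 3_485.7, 20: 1_879.8}
mathematica_multi = {10: 2_955.5, 20: 1_788.8}
csharp_multi = {10: 330_137.0, 20: 276_180.2}

# Values below 1.0 mean the parallel run was slower than the single-thread run.
mathematica_change = {
    d: mathematica_multi[d] / mathematica_single[d] for d in mathematica_single
}
# Gap between the two environments when both use 8 threads.
gap = {d: csharp_multi[d] / mathematica_multi[d] for d in mathematica_multi}

for d in sorted(gap):
    print(f"{d}D: Mathematica x{mathematica_change[d]:.2f}, C# {gap[d]:.0f}x faster")
```

Both Mathematica factors are below 1.0, which matches the slight decrease attributed to parallelization and synchronization overhead, while the gap to C# grows to more than 100 times.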

 

Figure 7 - Multi thread - operation/second

 

A summary graph with both versions of the application is shown in Figure 8.

Figure 8 - Single thread vs. multi thread - operation/second

Figure 9 shows the duration of the single-thread simulation sets. Notably, the C# simulations calculated many more function values than the Mathematica simulation sets (about 28 times more for 10D and about 20 times more for 20D).

Figure 9 - Single thread - calculation time

Figure 10 shows the duration of the multi-thread simulation sets. Here too the C# simulations calculated many more function values than the Mathematica simulation sets (about 24 times more for 10D and about 28 times more for 20D).

Figure 10 - Multi thread - calculation time

Summary

Mathematica 7

Advantages | Disadvantages
higher precision: extremes found with high precision | speed
better visualization functions | lower CPU usage in the single-thread application
oriented toward mathematical functions | unstable parallel function
easy to use |

Table 14 - Summary Mathematica 7

 

C# 4.0 framework

Advantages | Disadvantages
speed | lower numerical precision
better CPU usage in single-thread applications | no visualization tools (WPF[4] or Silverlight might be used)
better parallel optimization | fewer mathematical functions
easily extendable | coding oriented
free tools for coding |

Table 15 - Summary C# 4.0

Next steps for speedup

The next possible step in speeding up these extreme-finding algorithms is to port them to the NVidia CUDA platform, which can use the graphics card processor for parallel computing. A modern graphics card can contain 480 parallel CUDA cores, so one card can run 480 threads. This number of available threads could provide "cheap" supercomputing for calculating extreme values of functions. If the search algorithm can be parallelized and a multithread version exists, the code can be ported to CUDA to increase its speed. The CUDA platform is currently evolving very quickly and is likely to be widely used for parallel computing soon. In a test on the test computer, computing on the graphics card was more than 35 times faster. The graphics card in that computer contains only 64 CUDA cores; for this specific calculation it is equivalent to a CPU system with 128 cores. A graphics card with 480 cores should therefore be equivalent to a CPU system with 960 cores. The platform has one specific limitation: it is not as versatile as a classic CPU platform and is suited only to specific kinds of processing.

Figure 11 - Processing flow on CUDA

Conclusion

This paper described the speed differences between Wolfram Mathematica 7, a mathematical software environment, and the general-purpose programming language C# version 4.0. Mathematica is developed for mathematical operations and has a better library of mathematical functions and features for working with vectors and graphical objects. The tests showed, however, that the speed of the two solutions is completely different. Mathematica offers a better environment for working with functions but is slow. C# 4.0, by contrast, is a general-purpose programming environment: it is less convenient to code in, but it is very fast. C# does not include a comprehensive library of mathematical functions. Its library is small, but the missing procedures and functions can be coded by hand.

The tests lead to the following result: Wolfram Mathematica is the better choice for the highest precision and for visualization, whereas an application coded in C# is the best solution for very fast calculation of function values. The C# solution makes better use of the computing capacity of the test system.

 

Acknowledgement

Publication of this work was supported by the research grant No. IGA/57/FAI/10/D.

 

References

  • HANÁK, Ján. Praktické paralelné programovanie v jazykoch C# a C++. Artax, 2009.
  • HANÁK, Ján. Praktické objektové programování v jazyce C# 4.0. Artax, 2009.
  • ZELINKA, Ivan. Aplikace umělé inteligence. 1st ed. Zlín: Univerzita Tomáše Bati ve Zlíně, 2010. 151 p. ISBN 978-80-7318-898-6.
  • Evoluční výpočetní techniky: principy a aplikace. 1st ed. Praha: BEN - technická literatura, 2009. 534 p. ISBN 978-80-7300-218-3.
  • NVIDIA. CUDA Technology. 2010. http://www.nvidia.com/CUDA

 

[1] Version 4.0 was used

[2]Used in version 4.0

[3]parallel computing architecture developed by NVIDIA

[4] Windows Presentation Foundation




Scientific journal Trilobit | © 2009 - 2024 Faculty of Applied Informatics, TBU in Zlín | ISSN 1804-1795