A place to discuss everything related to Newton Dynamics.
Moderators: Sascha Willems, walaber
by Julio Jerez » Thu Sep 27, 2018 3:29 pm
ah ok, the two calls load.
two things.
the name sse4.2 is another copy and paste bug.
I will change it to sse.
the crash is also another copy and paste bug: I copied the newtonnSse4.2 folder, renamed it, and changed some of the sse4 intrinsics, like fmadd, to the equivalent sequence with sse.
but I probably missed some others.
it will run on my system because it supports them, but yours doesn't, therefore it will crash.
I will see what other intrinsics are sse4 specific and change them, and that should fix it.
-
Julio Jerez
- Moderator
-
- Posts: 12249
- Joined: Sun Sep 14, 2003 2:18 pm
- Location: Los Angeles
-
by Julio Jerez » Thu Sep 27, 2018 4:18 pm
um, I do not see any intrinsic that isn't allowed with SSE.
the only suspicious parts are the functions _mm_add_ps and _mm_cvtss_f32, but I believe these are legal to use; besides, the horizontal add in dgVector uses them.
Code:
    DG_INLINE float AddHorizontal() const
    {
        __m128 tmp0 (_mm_add_ps (m_low, m_high));
        __m128 tmp1 (_mm_hadd_ps (tmp0, tmp0));
        __m128 tmp2 (_mm_hadd_ps (tmp1, tmp1));
        return _mm_cvtss_f32 (tmp2);
    }
can you run with the default solver?
also the stack trace seems to be from a release build, can you run a debug build? and when it crashes, can you click show disassembly?
but first sync, the SSE4 one is renamed now.
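A side note on the snippet above: `_mm_add_ps` and `_mm_cvtss_f32` are plain SSE, but `_mm_hadd_ps` belongs to SSE3, so a build restricted to the original SSE set cannot rely on it. Below is a minimal sketch of a horizontal add that stays within `<xmmintrin.h>`. The `Float8` struct and the function names are hypothetical stand-ins for the `m_low`/`m_high` pair, not the actual dgVector code, and the scalar branch is only there so the sketch also compiles on targets without SSE.

```cpp
#include <cassert>

// Use SSE intrinsics only when the compiler targets SSE; otherwise use scalar code.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && (_M_IX86_FP >= 1))
#include <xmmintrin.h>   // plain SSE header only, no SSE3 or later
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical stand-in for the m_low / m_high pair (assumption, not Newton code).
struct Float8
{
    float low[4];
    float high[4];
};

// Sum all eight lanes without _mm_hadd_ps.
inline float AddHorizontalSseOnly(const Float8& v)
{
#if SKETCH_HAS_SSE
    const __m128 low = _mm_loadu_ps(v.low);
    const __m128 high = _mm_loadu_ps(v.high);
    const __m128 sum4 = _mm_add_ps(low, high);        // four partial sums s0..s3
    const __m128 upper = _mm_movehl_ps(sum4, sum4);   // (s2, s3, s2, s3)
    const __m128 sum2 = _mm_add_ps(sum4, upper);      // lane0 = s0+s2, lane1 = s1+s3
    const __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    const __m128 total = _mm_add_ss(sum2, lane1);     // lane0 = s0+s1+s2+s3
    return _mm_cvtss_f32(total);
#else
    float total = 0.0f;
    for (int i = 0; i < 4; ++i) total += v.low[i] + v.high[i];
    return total;
#endif
}

// Usage check: 1 + 2 + ... + 8 = 36, which is exactly representable in float.
inline void CheckAddHorizontalSseOnly()
{
    const Float8 v = {{1.0f, 2.0f, 3.0f, 4.0f}, {5.0f, 6.0f, 7.0f, 8.0f}};
    assert(AddHorizontalSseOnly(v) == 36.0f);
}
```

The lane order of the additions differs from the `_mm_hadd_ps` version, so the last bits of the result may differ for inputs that are not exactly representable.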
by JoeJ » Fri Sep 28, 2018 12:50 am
Assertion persists (i was always running debug):
[Attachment: disasm.JPG]
-
JoeJ
-
- Posts: 1453
- Joined: Tue Dec 21, 2010 6:18 pm
by JoeJ » Fri Sep 28, 2018 12:54 am
After deleting dx12.dll, 'newtonSSE_d' is now properly displayed, and i can run with that.
After deleting sse.dll i can run the default solver as well.
by JoeJ » Fri Sep 28, 2018 1:01 am
I tried to edit dgVectorSimd.h:
    DG_INLINE dgVector operator^ (const dgVector& data) const
    {
        //return _mm_xor_ps (m_type, data.m_type);
        return data.m_type;
    }
but nothing changes. The instruction must come from somewhere else.
by JoeJ » Fri Sep 28, 2018 1:06 am
Then i edited the project options of dgNedwonDx12 and set EnableEnhancedInstructionSet to SSE (was AVX)...
Yup, works: 'gpu experimental'
So you could do some benchmarking to see if it is worth having multiple permutations of the DX plugin...
by Julio Jerez » Fri Sep 28, 2018 1:18 am
oh I see the problem now from the assembly in the listing.
the instructions with a letter v in front are avx code,
yet another copy and paste from the avx script and editing.
I believe I have it fixed now.
Please sync again when you get time.
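One way to avoid this kind of crash is to check what the CPU supports before choosing a plugin. The following is a minimal sketch under stated assumptions: `CpuLevel`, `PluginInfo`, `DetectCpuLevel` and `PickPlugin` are hypothetical names for illustration and not part of the Newton plugin loader. The detection branch uses the GCC/Clang builtin `__builtin_cpu_supports` on x86 only; on any other compiler or platform the sketch conservatively reports the baseline level.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical ordering of instruction-set levels (assumption for this sketch).
enum CpuLevel { CPU_BASELINE = 0, CPU_SSE = 1, CPU_AVX = 2, CPU_AVX2 = 3 };

// Ask the CPU what it supports. Only the GCC/Clang x86 path is implemented here.
inline CpuLevel DetectCpuLevel()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return CPU_AVX2;
    if (__builtin_cpu_supports("avx")) return CPU_AVX;
    if (__builtin_cpu_supports("sse")) return CPU_SSE;
    return CPU_BASELINE;
#else
    return CPU_BASELINE;   // unknown platform: assume nothing beyond the baseline
#endif
}

// Hypothetical plugin description: a name plus the level it was compiled for.
struct PluginInfo
{
    const char* name;
    CpuLevel required;
};

// Return the most demanding plugin the CPU can still run, or nullptr if none fits.
inline const PluginInfo* PickPlugin(const PluginInfo* plugins, int count, CpuLevel cpu)
{
    const PluginInfo* best = nullptr;
    for (int i = 0; i < count; ++i) {
        if (plugins[i].required > cpu) continue;   // would execute unsupported instructions
        if (!best || plugins[i].required > best->required) best = &plugins[i];
    }
    return best;
}

// Usage check with made-up plugin names: a CPU without AVX gets the SSE build.
inline void CheckPickPlugin()
{
    const PluginInfo plugins[] = {{"avx2", CPU_AVX2}, {"avx", CPU_AVX}, {"sse", CPU_SSE}};
    const PluginInfo* picked = PickPlugin(plugins, 3, CPU_SSE);
    assert(picked && std::strcmp(picked->name, "sse") == 0);
}
```

A check like this only helps if each plugin's declared level matches the compiler options it was really built with, which is exactly what went wrong in this thread.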
by JoeJ » Fri Sep 28, 2018 2:21 am
Before i try, did you notice the DX12 project settings allow AVX?
JoeJ wrote: Then i edit project options of dgNedwonDx12, set EnableEnhancedInstructionSet to SSE (was AVX)...
Yup, works: 'gpu experimental'
by Julio Jerez » Fri Sep 28, 2018 8:19 am
yes, that was the bug.
the cmake script now sets both to sse2 for 32-bit builds and leaves it unset for 64-bit builds.
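A project setting like this can be double-checked from inside the code, because compilers expose the selected instruction set through predefined macros. The sketch below is an illustration only: `CompiledInstructionSet` and the `SKETCH_PLUGIN_MUST_NOT_USE_AVX` guard are hypothetical names, not part of Newton. It relies on `__AVX__` / `__AVX2__` (defined by MSVC, GCC and Clang when AVX code generation is enabled), `__SSE__` / `__SSE2__` (GCC and Clang) and `_M_IX86_FP` (MSVC, 32-bit x86).

```cpp
#include <cassert>
#include <cstring>

// Report the instruction-set level this translation unit was compiled for.
inline const char* CompiledInstructionSet()
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && (_M_IX86_FP >= 2))
    return "SSE2";   // also the baseline for every 64-bit x86 build
#elif defined(__SSE__) || (defined(_M_IX86_FP) && (_M_IX86_FP == 1))
    return "SSE";
#else
    return "baseline";
#endif
}

// Hypothetical build guard: a plugin that must run on CPUs without AVX can
// define SKETCH_PLUGIN_MUST_NOT_USE_AVX and fail the build instead of crashing later.
#if defined(SKETCH_PLUGIN_MUST_NOT_USE_AVX) && defined(__AVX__)
#error "Plugin is being compiled with AVX enabled; lower the instruction-set option."
#endif

// Usage check: the function always returns a non-empty label.
inline void CheckCompiledInstructionSet()
{
    assert(std::strlen(CompiledInstructionSet()) > 0);
}
```

Printing the returned label once at plugin load time would make a mismatch between the plugin name and its real build settings visible.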
by JoeJ » Fri Sep 28, 2018 4:24 pm
Works fine now. I've built all 5 plugIns this time, and selection works without problems
(It chooses DX and displays the GPU name)
by Julio Jerez » Fri Sep 28, 2018 5:32 pm
so Joe, you were never able to run the plugins, were you?
did you run the pyramid stack?
it is a 100 x 100 pyramid
and it stays up at 12 iterations, no cheats. it even goes to sleep.
by JoeJ » Fri Sep 28, 2018 5:55 pm
Julio Jerez wrote: it even goes to sleep.
How long do you wait until it sleeps?
I've been watching for 2-3 minutes, but it's still working hard
Runtime is about 500ms with SSE.
Oh, now the pyramid collapses, looks like timelapsed sand.
Still no sleep after collapse. I'll upload a screenshot - seems our results differ...
by Julio Jerez » Fri Sep 28, 2018 7:34 pm
12 iterations seems to be the cut-off point where it may or may not go to sleep.
I have noticed some randomness, I think caused by multithreading: some runs go to sleep and others collapse.
but on the two systems I tested, when setting the iteration count to 16, it goes to sleep each time.
I committed at 16 iterations, and I will make a small video so that we have a reference to compare to.
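One plausible source of such run-to-run differences is that floating-point addition is not associative: if threads contribute partial results in a different order, the rounded sum can change, and near a threshold that can tip a stack between sleeping and collapsing. The sketch below only demonstrates the rounding effect itself. Whether this is what happens inside the solver is an assumption, and `SumForward` / `SumBackward` are hypothetical helper names.

```cpp
#include <cassert>

// volatile forces every partial sum to be rounded to float precision,
// so the result does not depend on extended-precision registers.
inline float SumForward(const float* values, int count)
{
    volatile float acc = 0.0f;
    for (int i = 0; i < count; ++i) acc = acc + values[i];
    return acc;
}

// Same values, opposite accumulation order.
inline float SumBackward(const float* values, int count)
{
    volatile float acc = 0.0f;
    for (int i = count - 1; i >= 0; --i) acc = acc + values[i];
    return acc;
}

// Usage check: near 1e8 adjacent floats are 8 apart, so adding 1 to -1e8 is lost.
inline void CheckOrderMatters()
{
    const float values[3] = {1.0e8f, -1.0e8f, 1.0f};
    assert(SumForward(values, 3) == 1.0f);    // (1e8 - 1e8) + 1 = 1
    assert(SumBackward(values, 3) == 0.0f);   // (1 - 1e8) rounds to -1e8, then + 1e8 = 0
}
```

The same kind of last-bit difference can appear when two builds use different instruction sequences for the same formula.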
by JoeJ » Sat Sep 29, 2018 2:08 am
I've tried some settings, but even with 1 thread and 20 iterations it does not sleep with the SSE plugIn.
After deleting all plugIns:
It does sleep with 1 thread / 20 iter
No sleep with 4 threads / 16 iter
Does sleep with 4 threads / 20 iter
It's really an edge case, e.g. if i change the settings while simulation is running i get different results - need to restart it.
Maybe there is a difference in accuracy for older and newer CPUs? That would explain it.