Standardize code for HLS by lforg37 · Pull Request #42 · triSYCL/path_tracer

lforg37 · 2021-04-14T16:55:42Z

Add monostate in front of each variant (#41), replace remaining float with real_t

keryell

There are some ideas but think generic & functional programming! :-)

keryell · 2021-04-15T00:56:54Z

The float_t will allow some fixed-point artistic stuff by trying some famous libraries from Lyon. :-)

Ralender · 2021-04-15T10:35:14Z

-constexpr float infinity = std::numeric_limits<float>::infinity();
-constexpr float pi = 3.1415926535897932385f;
+constexpr real_t infinity = std::numeric_limits<real_t>::infinity();
+constexpr real_t pi = 3.1415926535897932385f;


Literals still assume float, which means that if we change real_t to something else we will get some conversions. Maybe defining our own user-defined literal could help abstract this.

constexpr real_t operator""_r(long double d) { return static_cast<real_t>(d); }

This would force the conversion to happen on the literals directly, so that they can be constant folded.
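A minimal sketch of the literal idea, assuming real_t is currently an alias for float (the alias and the `_r` suffix are illustrative here, not the project's final definitions):

```cpp
// Assumption: real_t is a plain alias; swapping it for another type
// only requires the literal operator below to stay constexpr-convertible.
using real_t = float;

// Hypothetical suffix: converts the literal to real_t at compile time.
constexpr real_t operator""_r(long double d) {
  return static_cast<real_t>(d);
}

// The constant no longer hard-codes the 'f' suffix.
constexpr real_t pi = 3.1415926535897932385_r;

// Evaluated entirely at compile time, so no run-time conversion remains.
static_assert(pi > 3.14_r && pi < 3.15_r);
```

With this in place, changing the alias would leave the constants untouched, provided the new type can be constructed from long double in a constant expression.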

Just use constinit.
I guess a lot of constexpr should be constinit for FPGA. To mention in the poster...

keryell

Great work!
But I am not convinced by the new design.
I cannot see a reason to use this explicit visitor pattern from a pre-modern C++ or Java world, when there were no generic lambdas...
You have created a new explicit visitor holding all the object-dependent implementation, which breaks encapsulation by inlining here everything that was in the hit function before.
After looking at the code, with the increased complexity of the new kitchen-sink visitor requiring direct access to public object members or über-friendship, I am even less convinced...
Having a distributed hit like before does not prevent adding some hierarchy: if you have 2 different implementations of sphere, for example, hit can be implemented in some sphere-specific class which is inherited by the various spheres.
To avoid blocking this PR, I suggest splitting it in 2, by moving this complex refactoring to a new one where we can think more about it.
Thanks for this experimentation! It would be interesting to see whether there is any impact on performance and QoR now that we have it...
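For illustration, the alternative argued for here (each object keeps its own hit, and a generic lambda dispatches over the variant) could look roughly like the sketch below. The shape types are hypothetical 1-D stand-ins, not the project's classes:

```cpp
#include <variant>

// Hypothetical shapes: each one encapsulates its own hit() test.
struct segment_sketch {
  float lo, hi;
  bool hit(float x) const { return lo <= x && x <= hi; }
};

struct ball_sketch {
  float center, radius;
  bool hit(float x) const {
    const float d = x - center;
    return d * d <= radius * radius;
  }
};

using hitable_sketch = std::variant<segment_sketch, ball_sketch>;

// A single generic lambda forwards to whichever alternative is stored,
// so no separate visitor class needs access to the shapes' members.
inline bool hit(const hitable_sketch& h, float x) {
  return std::visit([x](const auto& shape) { return shape.hit(x); }, h);
}
```

Adding a new shape then means adding one type with a hit member and one entry in the variant, without touching a central visitor.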

keryell · 2021-04-16T16:37:21Z

+  }
+};
+} // namespace raytracer::visitor
+#endif


Missing end-of-line.
Yes, it is good to have this in its own file.

Curious... An IDE configuration bug?

keryell · 2021-04-16T16:40:03Z

+  point p;         // hit point
+  vec normal;      // normal at hit point
+  bool front_face; // to check if hit point is on the outer surface
+  /*local coordinates for rectangles
+  and mercator coordinates for spheres */
+  real_t u;
+  real_t v;
+
+  // To set if the hit point is on the front face


While you are here, add more Doxygen

Suggested change

-  point p;         // hit point
-  vec normal;      // normal at hit point
-  bool front_face; // to check if hit point is on the outer surface
-  /*local coordinates for rectangles
-  and mercator coordinates for spheres */
-  real_t u;
-  real_t v;
-
-  // To set if the hit point is on the front face
+  point p;         ///< Hit point
+  vec normal;      ///< Normal at hit point
+  bool front_face; ///< To check if hit point is on the outer surface
+  /// Local coordinates for rectangles and mercator coordinates for spheres
+  real_t u;
+  real_t v;
+
+  /// To set if the hit point is on the front face

lforg37 · 2021-04-16T18:26:25Z

Discussion about the refactoring can be continued on #43

keryell

Functional + generic C++ = Zen++ :-)

keryell

Looking at the code, I wonder whether we could merge dev_visit(monostate_dispatch into a simpler monostate_visit that includes the visit in its implementation.

keryell · 2021-05-18T02:18:31Z

Where are we on this?

keryell · 2021-06-09T15:18:05Z

@lforg37 what's up with this?
Probably you should also at least make a WIP branch with the current version we discussed today with the random generator with low spatial quality.

lforg37 · 2021-06-09T18:58:05Z

The current version is based on this branch, commits have been added.
I still need to integrate some suggestions from the review.
I'll "undraft" the PR once it's done.

keryell · 2021-06-09T20:14:39Z

+  bool hit(const ray& r, real_t min, real_t max, hit_record& rec,
           material_t& hit_material_type) const {
    hit_record temp_rec;
    material_t temp_material_type;


I have the feeling that if we remove the std::monostate from the material it will always create a sphere here even when there is no intersection.
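The concern can be reproduced in isolation: a default-constructed std::variant always holds its first alternative, so without std::monostate a temporary such as temp_material_type starts out as a real material. The material types below are hypothetical placeholders, not the project's:

```cpp
#include <variant>

// Hypothetical materials standing in for the project's alternatives.
struct lambertian_sketch { int id = 1; };
struct metal_sketch { int id = 2; };

// With std::monostate first, a default-constructed value means "nothing".
using material_with_empty =
    std::variant<std::monostate, lambertian_sketch, metal_sketch>;

// Without it, default construction builds the first real alternative.
using material_without_empty =
    std::variant<lambertian_sketch, metal_sketch>;

inline bool is_empty(const material_with_empty& m) {
  return std::holds_alternative<std::monostate>(m);
}
```

Whether constructing that first alternative costs anything on the FPGA depends on what its constructor does; the sketch only shows which alternative is selected.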

keryell · 2021-06-09T20:23:00Z

    /// of the ray hitting a smoke particle
    const auto distance_inside_boundary = (rec2.t - rec1.t) * ray_length;
-    const auto hit_distance = neg_inv_density * sycl::log(rng.float_t());
+    auto rng = LocalPseudoRNG { toseed(r.direction()) };


Ah, I realize you have removed the context passed everywhere and replaced basically the random generator with some local computation.
What was the rationale?

Since getting a value from the RNG updates its internal state, a shared RNG creates a very long chain of read-after-write dependencies that prevents the HLS compiler from parallelizing the otherwise independent computation steps.
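As a rough illustration of the local approach, each call site can own a tiny generator whose state never leaves the function, so no dependency chain crosses independent computations. The sketch below uses a plain xorshift32 as a stand-in; it is an assumption for illustration and not the PR's LocalPseudoRNG:

```cpp
#include <cstdint>

// Hypothetical local generator: the state is a private member, so two
// instances seeded separately share no read-after-write dependency.
struct local_rng_sketch {
  std::uint32_t state;

  // xorshift32 must not start from zero, so fall back to 1 in that case.
  explicit constexpr local_rng_sketch(std::uint32_t seed)
      : state{seed != 0 ? seed : 1u} {}

  // One xorshift32 step (shift constants 13, 17, 5).
  constexpr std::uint32_t next() {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
  }

  // Uniform float in [0, 1) built from the top 24 bits.
  constexpr float unit() { return (next() >> 8) * (1.0f / 16777216.0f); }
};
```

The price is that the quality of the stream now depends entirely on how well the seeds derived from the ray data are spread.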

Perhaps we could have this context with some special HLS decorations to ignore some dependencies.

For example https://xilinx.github.io/Vitis_Accel_Examples/2020.2/html/dependence_inter.html

Otherwise, perhaps something like this pseudo HLS-SYCL code:

// crunch() and mix() are placeholders for the state update and output functions
auto rng(std::source_location local_seed = std::source_location::current()) {
  return [=] {
    static auto state = local_seed.line();
#pragma HLS dependence variable=state inter false
    state = crunch(state);
    return mix(state);
  };
}

or

// crunch() and mix() are placeholders for the state update and output functions
auto rng(std::source_location local_seed = std::source_location::current()) {
  return [state = local_seed.line()]() mutable {
#pragma HLS dependence variable=state inter false
    state = crunch(state);
    return mix(state);
  };
}

Perhaps we need extension intel/llvm#3746 to have the big picture working.

@aisoard, @yu810226 any feedback?

Above is basically a good idea, if I understand the intention, to separate the state into multiple independent RNGs. I'm not sure what you're trying to achieve with the false dependence pragma, however.

@stephenneuendorffer yes, the idea is to distribute the state here in a lazy way, even initializing the seed with something that depends on the source location (here the line number; we could also use the file name, etc.).
I have probably put too much here with the

#pragma HLS dependence variable=state inter false

as a way to have an II=1 without consuming too much latency, at the price of a semantic change in the code, and hoping we still have enough randomness...
But I agree this is a second-order optimization, to consider only if we are still in trouble...
Perhaps this code will not work with SYCL-HLS because it uses some function pointer. If so, we can try a macro to avoid returning the lambda, or perhaps just a macro which inlines a random generator wherever we want one... Ahem... :-)

keryell

I made some comparisons between this version and the main one.
You have not updated the README about how to run the program.
It looks like time ./sycl-rt 800 480 50 100 is a way to get the same behaviour as before.
It is a little bit faster than before, I guess because there are some parts of the variant which have been removed...
Also I wonder how it is possible to have the motion blur working without a random generator... by having the time or motion parameter included in the hash? It starts being painful...

keryell · 2021-06-15T14:29:20Z

-  uint32_t shifted4 = (y2 & 63) << 10;
-  uint32_t shifted5 = (z1 & 63) << 5;
-  uint32_t shifted6 = (z2 & 63);
-  return shifted1 ^ shifted2 ^ shifted3 ^ shifted4 ^ shifted5 ^ shifted6;


Ah, I read too quickly and was thinking of 63 as a 5-bit mask...
Actually, it then seems like a good idea to use an xor. Would more than 6 bits be better? 127 or 255?

keryell

There are still some super old comments and a merge conflict

keryell · 2021-07-05T13:54:14Z

@lforg37 where are we on this?

keryell

Almost there!

keryell · 2021-07-06T12:52:26Z

+  }
+};
+} // namespace raytracer::visitor
+#endif


Curious... An IDE configuration bug?

keryell · 2021-07-06T12:53:07Z

-    auto& rng = ctx.rng;
+  bool scatter(auto& ctx, const ray& r_in, const hit_record& rec, color& attenuation,
+               ray& scattered) const {
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

-    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };
+    LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

keryell · 2021-07-06T12:53:55Z

-    auto& rng = ctx.rng;
+  bool scatter(auto&, const ray& r_in, const hit_record& rec, color& attenuation,
+               ray& scattered) const {
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

-    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };
+    LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

keryell · 2021-07-06T12:54:13Z

    // Attenuation of the ray hitting the object is modified based on the color
    // at hit point
-    auto& rng = ctx.rng;
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

-    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };
+    LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

keryell · 2021-07-06T12:54:49Z

-    auto& rng = ctx.rng;
+  bool scatter(auto& ctx, const ray& r_in, const hit_record& rec, color& attenuation,
+               ray& scattered) const {
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

-    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };
+    LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

I do not know why I am getting bored here...

keryell · 2021-07-06T13:06:56Z

+  std::memcpy(&x2, &val2.x(), sizeof(uint32_t));
+  std::memcpy(&y2, &val2.y(), sizeof(uint32_t));
+  std::memcpy(&z2, &val2.z(), sizeof(uint32_t));
+  uint32_t shifted1 = x1 << 26;


Use auto everywhere

keryell · 2021-07-06T13:09:00Z

+uint32_t toseed(vec const& val1, vec const& val2) {
+  uint32_t x1, y1, z1, x2, y2, z2;
+  std::memcpy(&x1, &val1.x(), sizeof(uint32_t));
+  std::memcpy(&y1, &val1.y(), sizeof(uint32_t));


Suggested change

-  std::memcpy(&y1, &val1.y(), sizeof(uint32_t));
+  std::memcpy(&y1, &val1.y(), sizeof y1);

everywhere.
It is clearer to use the size of the destination. It is also shorter, since sizeof applied to an expression does not need the parentheses. :-)
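Put together, the seed computation under discussion could be sketched as below: the bit pattern of each coordinate is copied with memcpy sized from the destination, then 6-bit fields are folded in at a 5-bit stride, which is why the fields overlap and xor is used to combine them. The vector type, the function names and the exact stride are assumptions for illustration, not the PR's code:

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <initializer_list>

// Hypothetical stand-in for the project's vec type.
using vec3_sketch = std::array<float, 3>;

// Copy the bit pattern of a float into a 32-bit integer.
inline std::uint32_t bits_of(float f) {
  static_assert(sizeof(std::uint32_t) == sizeof(float));
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);  // size taken from the destination
  return u;
}

// Fold the low 6 bits of each coordinate into one seed. Shifting by 5
// while masking 6 bits makes neighbouring fields overlap by one bit,
// so they are combined with xor.
inline std::uint32_t toseed_sketch(const vec3_sketch& a,
                                   const vec3_sketch& b) {
  std::uint32_t seed = 0;
  for (float v : {a[0], a[1], a[2], b[0], b[1], b[2]})
    seed = (seed << 5) ^ (bits_of(v) & 63u);
  return seed;
}
```

Widening the mask to 127 or 255 would only change the constant in the loop; whether that improves the spread of seeds would have to be checked on rendered images.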

keryell · 2021-07-06T13:12:57Z

 #include <algorithm>
 #include <chrono>
 #include <cstdint>
+#include <ctime>


Curious, why?

keryell · 2021-07-06T13:21:10Z

+  @return uint32_t 
+ */
+uint32_t toseed(vec const& val1, vec const& val2) {
+  uint32_t x1, y1, z1, x2, y2, z2;


You can use the C++ version everywhere instead of the C version: std::uint32_t

keryell · 2021-07-06T13:24:06Z

+int main(int argc, char* argv[]) {
+  if (argc < 5 || argc > 7) {
+    std::cerr << "Usage: sycl-rt OUT_WIDTH OUT_HEIGHT DEPTH SAMPLES "
+                 "[SPHERE_INC [RAND_SEED]]"


Update the README to describe this new API

Replace float with real_t

ee32c11

lforg37 requested review from Ralender and keryell April 14, 2021 16:55

keryell requested changes Apr 15, 2021

Comment thread include/constant_medium.hpp

Ralender reviewed Apr 15, 2021

lforg37 force-pushed the issue41 branch from 6fff7f3 to 3979043 Compare April 16, 2021 09:07

Adding monostate in front of each variant

3979043

lforg37 force-pushed the issue41 branch from 63911e5 to 125e43e Compare April 16, 2021 15:50

keryell reviewed Apr 16, 2021

lforg37 force-pushed the issue41 branch from 125e43e to 3979043 Compare April 16, 2021 18:23

lforg37 mentioned this pull request Apr 16, 2021

Refactoring #43

Closed

keryell requested changes Apr 16, 2021

Comment thread include/hitable.hpp Outdated

lforg37 force-pushed the issue41 branch from fa67cae to 1893a47 Compare April 20, 2021 11:28

Luc Forget added 2 commits April 20, 2021 04:30

Replace visitor object by function

1893a47

Simplified version

1741496

lforg37 requested a review from keryell April 20, 2021 13:12

keryell reviewed Apr 21, 2021

Get rid of context, makeall dimension runtime values

8d3c225

Luc Forget added 2 commits May 25, 2021 06:59

WIP feature restauration

6ceb4ce

Restore Features

60c883a

Get rid of monostate

8a73547

lforg37 marked this pull request as draft June 9, 2021 18:56

keryell reviewed Jun 9, 2021

Luc Forget added 2 commits June 11, 2021 00:49

Formatting

4644777

Improving decentralised RNG seed

70c34f9

lforg37 marked this pull request as ready for review June 11, 2021 12:03

lforg37 marked this pull request as draft June 11, 2021 12:05

FPGA friendly local random seed generator

b92d9c2

lforg37 marked this pull request as ready for review June 14, 2021 13:26

keryell reviewed Jun 14, 2021

Comment thread include/rtweekend.hpp

Comment thread include/rtweekend.hpp

Comment thread include/rtweekend.hpp Outdated

Comment thread src/main.cpp Outdated

Luc Forget added 3 commits June 15, 2021 01:27

Address review comments

635037d

Restore motion blur

5e96817

Restore image texture and xy_rect

092b972

keryell reviewed Jun 15, 2021

keryell requested changes Jun 28, 2021

Merging main

8afb9cb

keryell requested changes Jul 6, 2021
