Skip to content

Conversation

@linh2931
Copy link
Contributor

@linh2931 linh2931 commented Apr 7, 2025

Impements sync call trace. Each individual sync call has a sync_call_trace struct containing all information about the sync call. An action trace has a vector of sync_call_trace for all sync calls happened within the call. The relationship between sync calls are identified by ordinal and sender_ordinal.

console field and condole debug prints will be implemented in a separate PR.

SHiP support is tracked by #1334

Resolves #1216

A sample sync call traces for

 [
  {
    "action_ordinal": 1,
    "creator_action_ordinal": 0,
    "closest_unnotified_ancestor_action_ordinal": 0,
    "receipt": {
      "receiver": "caller",
      "act_digest": "4169266343f52db133bded246813966fb493bf53a4c16ecea999ece1a6ca7f9f",
      "global_sequence": 48,
      "recv_sequence": 1,
      "auth_sequence": [
        [
          "caller",
          3
        ]
      ],
      "code_sequence": 1,
      "abi_sequence": 1
    },
    "receiver": "caller",
    "act": {
      "account": "caller",
      "name": "doit",
      "authorization": [
        {
          "actor": "caller",
          "permission": "active"
        }
      ],
      "data": ""
    },
    "context_free": false,
    "elapsed": 618,
    "console": "before calling callee1before calling callee2",
    "trx_id": "9189869e2c41413775edbaf1721d036dce527ab576a4e883a874d47f9f39bd2b",
    "block_num": 11,
    "block_time": "2020-01-01T00:00:05.000",
    "account_ram_deltas": [],
    "return_value": "",
    "call_traces": [
      {
        "call_ordinal": 1,
        "sender_ordinal": 0,
        "receiver": "callee1",
        "read_only": false,
        "data": "",
        "elapsed": 299,
        "console": "I am callee1",
        "console_markers": [
          12
        ],
        "return_value": "4920616d2063616c6c656531"
      },
      {
        "call_ordinal": 2,
        "sender_ordinal": 1,
        "receiver": "callee11",
        "read_only": false,
        "data": "4920616d2063616c",
        "elapsed": 145,
        "console": "I am callee11",
        "console_markers": [],
        "return_value": "4920616d2063616c6c65653131"
      },
      {
        "call_ordinal": 3,
        "sender_ordinal": 0,
        "receiver": "callee2",
        "read_only": false,
        "data": "",
        "elapsed": 115,
        "console": "I am callee2",
        "console_markers": [],
        "return_value": "4920616d2063616c6c656532"
      }
    ],
    "console_markers": [
      22,
      44
    ]
  }
]

A sample for exception propagation:

[
  {
    "action_ordinal": 1,
    "creator_action_ordinal": 0,
    "closest_unnotified_ancestor_action_ordinal": 0,
    "receiver": "caller",
    "act": {
      "account": "caller",
      "name": "doit",
      "authorization": [
        {
          "actor": "caller",
          "permission": "active"
        }
      ],
      "data": ""
    },
    "context_free": false,
    "elapsed": 677,
    "console": "",
    "trx_id": "9189869e2c41413775edbaf1721d036dce527ab576a4e883a874d47f9f39bd2b",
    "block_num": 11,
    "block_time": "2020-01-01T00:00:05.000",
    "account_ram_deltas": [],
    "except": {
      "code": 3050003,
      "name": "eosio_assert_message_exception",
      "message": "eosio_assert_message assertion failure",
      "stack": [
        {
          "context": {
            "level": "error",
            "file": "cf_system.cpp",
            "line": 14,
            "method": "eosio_assert",
            "hostname": "",
            "thread_name": "unit_test",
            "timestamp": "2025-04-14T21:00:04.716"
          },
          "format": "assertion failure with message: ${s}",
          "data": {
            "s": "eosio_assert by callee11"
          }
        },
        {
          "context": {
            "level": "warn",
            "file": "host_context.cpp",
            "line": 121,
            "method": "execute_sync_call",
            "hostname": "",
            "thread_name": "unit_test",
            "timestamp": "2025-04-14T21:00:04.716"
          },
          "format": "sync call exception ${receiver} <= ${sender} console output: ${console}",
          "data": {
            "receiver": "callee11",
            "sender": "callee1",
            "console": ""
          }
        },
        {
          "context": {
            "level": "warn",
            "file": "host_context.cpp",
            "line": 121,
            "method": "execute_sync_call",
            "hostname": "",
            "thread_name": "unit_test",
            "timestamp": "2025-04-14T21:00:04.716"
          },
          "format": "sync call exception ${receiver} <= ${sender} console output: ${console}",
          "data": {
            "receiver": "callee1",
            "sender": "caller",
            "console": ""
          }
        },
        {
          "context": {
            "level": "warn",
            "file": "apply_context.cpp",
            "line": 124,
            "method": "exec_one",
            "hostname": "",
            "thread_name": "unit_test",
            "timestamp": "2025-04-14T21:00:04.716"
          },
          "format": "${receiver} <= ${account}::${action} console output: ${console}",
          "data": {
            "console": "",
            "account": "caller",
            "action": "doit",
            "receiver": "caller"
          }
        }
      ]
    },
    "error_code": "10000000000000000000",
    "return_value": "",
    "call_traces": [
      {
        "call_ordinal": 1,
        "sender_ordinal": 0,
        "receiver": "callee1",
        "read_only": false,
        "data": "",
        "elapsed": 478,
        "console": "",
        "console_markers": [
          0
        ],
        "except": {
          "code": 3050003,
          "name": "eosio_assert_message_exception",
          "message": "eosio_assert_message assertion failure",
          "stack": [
            {
              "context": {
                "level": "error",
                "file": "cf_system.cpp",
                "line": 14,
                "method": "eosio_assert",
                "hostname": "",
                "thread_name": "unit_test",
                "timestamp": "2025-04-14T21:00:04.716"
              },
              "format": "assertion failure with message: ${s}",
              "data": {
                "s": "eosio_assert by callee11"
              }
            },
            {
              "context": {
                "level": "warn",
                "file": "host_context.cpp",
                "line": 121,
                "method": "execute_sync_call",
                "hostname": "",
                "thread_name": "unit_test",
                "timestamp": "2025-04-14T21:00:04.716"
              },
              "format": "sync call exception ${receiver} <= ${sender} console output: ${console}",
              "data": {
                "receiver": "callee11",
                "sender": "callee1",
                "console": ""
              }
            },
            {
              "context": {
                "level": "warn",
                "file": "host_context.cpp",
                "line": 121,
                "method": "execute_sync_call",
                "hostname": "",
                "thread_name": "unit_test",
                "timestamp": "2025-04-14T21:00:04.716"
              },
              "format": "sync call exception ${receiver} <= ${sender} console output: ${console}",
              "data": {
                "receiver": "callee1",
                "sender": "caller",
                "console": ""
              }
            }
          ]
        },
        "error_code": "10000000000000000000",
        "return_value": ""
      },
      {
        "call_ordinal": 2,
        "sender_ordinal": 1,
        "receiver": "callee11",
        "read_only": false,
        "data": "0000000000000000",
        "elapsed": 305,
        "console": "",
        "console_markers": [],
        "except": {
          "code": 3050003,
          "name": "eosio_assert_message_exception",
          "message": "eosio_assert_message assertion failure",
          "stack": [
            {
              "context": {
                "level": "error",
                "file": "cf_system.cpp",
                "line": 14,
                "method": "eosio_assert",
                "hostname": "",
                "thread_name": "unit_test",
                "timestamp": "2025-04-14T21:00:04.716"
              },
              "format": "assertion failure with message: ${s}",
              "data": {
                "s": "eosio_assert by callee11"
              }
            },
            {
              "context": {
                "level": "warn",
                "file": "host_context.cpp",
                "line": 121,
                "method": "execute_sync_call",
                "hostname": "",
                "thread_name": "unit_test",
                "timestamp": "2025-04-14T21:00:04.716"
              },
              "format": "sync call exception ${receiver} <= ${sender} console output: ${console}",
              "data": {
                "receiver": "callee11",
                "sender": "callee1",
                "console": ""
              }
            }
          ]
        },
        "error_code": "10000000000000000000",
        "return_value": ""
      }
    ],
    "console_markers": [
      0
    ]
  }
]

@linh2931 linh2931 requested review from heifner and spoonincode April 7, 2025 19:48
@linh2931 linh2931 linked an issue Apr 7, 2025 that may be closed by this pull request
std::optional<fc::exception> except;
std::optional<uint64_t> error_code;
std::vector<char> return_value;
std::optional<std::vector<sync_call_trace>> call_traces;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an optional vector? I'm guessing you don't want call_traces to appear in JSONified trace if an action didn't use sync calls at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what I want. Should we show call_traces with an empty "" instead if an action did not make any sync calls? @arhag

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't an empty vector be sufficient to unambiguously indicate the action didn't use sync calls at all?

I don't see the harm of always including the call_traces in the JSON output of the action trace with it being [] if no sync calls were made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both. I will change to use plain vector.


uint32_t ordinal = 1;
const uint32_t sender_ordinal = 0;
const account_name sender;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is sender redundant information since you have sender_ordinal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put sender here for simple reference. But it can be found from sender_ordinal. Will remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string console;
std::optional<fc::exception> except;
std::optional<uint64_t> error_code;
int64_t return_value_size_or_error_id = 0; // if >=0: return value size, if -1: receiver not supporting sync calls
Copy link
Contributor

@spoonincode spoonincode Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value size is redundant information since it's already implicitly part of the return_value. I am not sure how to best handle the validity of this and other items in the trace. It almost seems like really we have a variant made up of

  • contract was runnable: elapsed, console, except,error_code, return_value
  • contract was not runnable error_id (i.e. reason it wasn't runnable)

But using a variant here may be quite the departure from precedent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was struggling about this too. I would replace return_value_size_or_error_id with an optional error_id. If a contract is runnable, this field is not presented. If a contract is not runnable, this field shows up with reasons., while elapsed, console, except, error_code, and return_value are empty.

const uint32_t sender_ordinal = 0;
const account_name sender;
const account_name receiver;
const uint64_t flags = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make any sense to "unwrap" the flags in the user visible trace? For example instead of a consumer just seeing flags: 1 or flags: 2 and needing to know what the magic numbers are, they'd see for example

   "read_only": false,
   "some_other_flag": true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I will unwrap it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One consideration is, if we use new bits in the future, we will need to add them to the call trace. But that should be OK as it is a protocol change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true... I suppose it would be most disruptive for SHIP since it would require a whole new sync_call_trace_vx whenever a new flag is added.

Copy link
Contributor

@spoonincode spoonincode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a few comments on structure of user visible trace that has been nagging me -- I'm not really sure what the correct solution is for some of these. Especially whether there is precedence to lean in to for inconsistency or not.

@spoonincode
Copy link
Contributor

btw couldn't find any issue/task for adding the traces to trace_api_plugin, though the design doc does say that traces will be added there.

@linh2931
Copy link
Contributor Author

btw couldn't find any issue/task for adding the traces to trace_api_plugin, though the design doc does say that traces will be added there.

Thanks. I forgot to create one. Here is now #1348

@linh2931
Copy link
Contributor Author

Adding a few comments on structure of user visible trace that has been nagging me -- I'm not really sure what the correct solution is for some of these. Especially whether there is precedence to lean in to for inconsistency or not.

I will add some comments about the new struct and its fields to help users.

uint32_t ordinal = 1;

// sender's ordinal. If the caller is the action, sender_ordinal is 0.
const uint32_t sender_ordinal = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have noticed this sooner -- I think we'd want to make these varints (fc::unsigned_int). Ordinals in action_trace are also varints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will change.

struct console_marker {
uint32_t call_ordinal = 0;
uint32_t console_position = 0;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably varint for these too

// Identify the starting position of a sync call in the call's console log.
// This is used for pretty printing console logs.
struct console_marker {
uint32_t call_ordinal = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When discussed on call I thought we came to the conclusion we only needed the position? i.e. type could be vector<unsigned_int> console_markers and the size of console_markers is the number of sync calls performed I think1. Can we still do it that way or was there a challenge with it?

Footnotes

  1. I seem to remember us saying number of sync calls minus 1 but that isn't making sense to me immediately 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought using the struct was more clear. But will change to one dimensional vector to save space, as this is to be used by SHiP. n-1 should be OK as we know the start and end.

Copy link
Contributor Author

@linh2931 linh2931 Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, when I started to change the code, it seemed to me we need n instead of n-1. Did I miss something?

If we have n sync calls, we should have n points to indicate where in the console we start the calls.

If we make only 1 sync call, I think we still need 1 marker, not 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are correct @linh2931.

}
}
} FC_RETHROW_EXCEPTIONS( warn, "${receiver} <= ${account}::${action} pending console output: ${console}", ("console", _pending_console_output)("account", act->account)("action", act->name)("receiver", receiver) )
} FC_RETHROW_EXCEPTIONS( warn, "${receiver} <= ${account}::${action} console output: ${console}", ("console", trx_context.get_action_trace( action_ordinal ).console)("account", act->account)("action", act->name)("receiver", receiver) )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to have a get_current_action_trace() or such, since this trx_context.get_action_trace( action_ordinal ) is used all over. Obviously not critical to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I renamed get_root_action_trace() (I did not like the name when I initially named it) with get_current_action_trace(), and replaced direct use of trx_context.get_action_trace( action_ordinal ) with get_current_action_trace(). The code looks nicer.

26fbb7b

// As early as possible, create the call trace of this new sync call in the parent's
// (sender's) trace to record entire trace of the sync call, including any exceptions
auto& trace = get_root_action_trace();
const bool read_only = flags & static_cast<uint64_t>(sync_call_flags::read_only);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an earlier PR we discussed using has_field() not sure how much it matters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the last element all_allowed_bits in the enum sync_call_flags is not a flag (or field in a sense). I did not use the has_field mechanism. Will revisit this later.

auto& call_trace = get_call_trace(ordinal);
call_trace.error_id = -1;
finalize_call_trace(call_trace, start);
trx_context.checktime();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not strictly necessary -- since no new WASM was run in this case, the expiration handlers weren't touched and we didn't run a risk of missing an expiration. Fine to keep as is since we may need to take more of a look on how expiration works in the future

// TBD store the info in sync call trace
ilog("receiver_account->code_hash empty");
return return_value_size;
return handle_call_failure();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case the console marker will not be created. Seems like creation of console marker should be moved way up earlier in the function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it here thinking if we could not make a sync call for any reasons before creating the sync call context, we should not put the sync call position in the markers, since no call was made.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No call was made but you added a trace for the call still. That means it becomes impossible to use the markers properly because the number of markers will not equal the number of sync call traces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I will put them together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// store return value
last_sync_call_return_value = std::move(call_ctx.return_value);
return_value_size = last_sync_call_return_value.size();
} catch( const wasm_exit&) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not new to this PR but I realize here that if a contract sets return value then eosio_exits it looks like the return value is lost. Worse then that, highly likely there is a JIT vs OC difference here. Can make new issue for this but we need to track it some how if not fixed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class sync_call_context : public host_context {
public:
sync_call_context(controller& con, transaction_context& trx_ctx, account_name sender, account_name receiver, bool privileged, uint32_t sync_call_depth, uint64_t flags, std::span<const char> data);
sync_call_context(controller& con, transaction_context& trx_ctx, uint32_t oridnal, action_trace& root_action_trace, account_name sender, account_name receiver, bool privileged, uint32_t sync_call_depth, uint64_t flags, std::span<const char> data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling on ordinal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spoonincode
Copy link
Contributor

I think these 3 should be addressed but otherwise LGTM
#1333 (comment)
#1333 (comment)
#1333 (comment)

@spoonincode
Copy link
Contributor

Also probably want to update the PR description with the new trace format. But there is plenty other documentation that will need to be updated one day too

…direct use of trx_context.get_action_trace( action_ordinal ) with get_current_action_trace()
… number of markers matches with the number of call traces
@linh2931
Copy link
Contributor Author

Also probably want to update the PR description with the new trace format. But there is plenty other documentation that will need to be updated one day too

I updated the PR description with the new trace format. It looks quite nice!

Yes, we will write good documentation when the feature is done.

@linh2931
Copy link
Contributor Author

I think these 3 should be addressed but otherwise LGTM #1333 (comment) #1333 (comment) #1333 (comment)

They have been addressed. Thanks!

} catch(...) {}

mvo("call_traces", act_trace.call_traces);
mvo("console_markers", act_trace.console_markers);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right to me. The abi_serializer job is to produce the data in clean JSON. This will be a list of ints which doesn't seem that easy to understand. It is not clear to me what would be better. Maybe insert [[call_trace identifer - BEGIN]] [[call_trace identifer - END]] into the console output and remove the console_markers output.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about this in multiple daily standups and afaict it matches what was agreed upon? If the JSON (not what cleos pretty prints) needs to be easier to manually grok we should discuss it again and come up with a new approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, we did discuss this. Seeing it in practice makes me think it is less than ideal. But the alternative of parsing the JSON console string is not great either. I withdraw my comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only immediate thought I have is to change the console logs to be a vector<variant<string, varint>>. So the effective output is something like

"console": ["console output before a sync call", 1, "some stuff between two sync calls", 2, "after doing a couple sync calls"]

The numbers being the sync call ordinals.

Now of course if you output vector<variant<string, varint>> through standard variant mechanism it won't look like that. However, we're already "manually" crafting the trace's JSON for ABI stuff! So it isn't a stretch to go ahead and collapse the variant to the appropriate JSON types so it does look that nice as above. This does make the outputted JSON impossible to ingest through ABI again though, but I can't see any immediate reason that's a problem?

Of course the traces in ship would need to remain a proper variant<string, varint> which does introduce an inconsistency, at least technically. But ship never does ABI deserialization either so it's already not going to be exactly what the trace was from the node.

idk if it's worth it -- but I do believe it would improve readability some.

We also would still have the challenge of what to do about current console in action trace (the current approach is nice in that it doesn't touch it at all)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure it is worth it. But that could also be:

"console": ["console output before a sync call", "some stuff between two sync calls", "after doing a couple sync calls", "", "after another sync call with no logging in between"]

Then it would be representable in ABI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine as is.

I would rather not change the existing console field to not break any consumers that currently expect it as a string.

@linh2931 linh2931 merged commit e4a8372 into sync_call Apr 23, 2025
36 checks passed
@linh2931 linh2931 deleted the call_trace branch April 23, 2025 23:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SC: Add sync_call traces to call traces

4 participants