-
Notifications
You must be signed in to change notification settings - Fork 13
feat: Add StatusMessageWatcher
#407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Update test to be able to use caplog
Refactor tests and code to reuse code.
Update default logger to not duplicate handlers if already exists
Have explicit start and stop instead of __call__
Update test data to properly match the reality with line endings
Naming.
Add final timeout.
Example log of recursive Actor run with highlighted redirected status messages Example Actor: |
e903a68
to
18f4f51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds the new StatusMessageRedirector
utility to stream status and status messages from a target Actor run (both shallow and deep redirection), integrates it into the client APIs, and updates tests to cover the new behavior.
- Introduce
StatusMessageRedirector
(sync & async) inlog.py
- Add
get_status_message_redirector
toRunClient
(sync & async) inrun.py
- Hook status redirection into
ActorClient.call
(sync & async) inactor.py
- Update and extend
test_logging.py
to exercise status message streaming
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
tests/unit/test_logging.py | Add fixtures and new tests for StatusMessageRedirector |
src/apify_client/clients/resource_clients/run.py | New get_status_message_redirector methods for sync and async runs |
src/apify_client/clients/resource_clients/log.py | Core StatusMessageRedirector , StatusMessageRedirectorSync/Async |
src/apify_client/clients/resource_clients/actor.py | Wire up status redirector into ActorClient.call sync & async variants |
Comments suppressed due to low confidence (5)
tests/unit/test_logging.py:386
- [nitpick] We should also add a test for the manual
start()
/stop()
usage ofStatusMessageRedirector
(outside of a context manager) to ensure that flow is covered.
@respx.mock
src/apify_client/clients/resource_clients/run.py:349
create_redirect_logger
is used here but not imported; addfrom apify_client._logging import create_redirect_logger
at the top.
to_logger = create_redirect_logger(f'apify.{name}')
src/apify_client/clients/resource_clients/log.py:455
asyncio.create_task
is called butasyncio
is not imported; addimport asyncio
.
self._logging_task = asyncio.create_task(self._log_changed_status_message())
tests/unit/test_logging.py:168
- This fixture mutates a class‐level variable without resetting it, which can leak state across tests; consider using
yield
and restoring the original value after the test.
StatusMessageRedirector._force_propagate = True
src/apify_client/clients/resource_clients/run.py:337
- Docstring says it returns
StatusMessageRedirector
, but in sync/async methods it actually returnsStatusMessageRedirectorSync
/StatusMessageRedirectorAsync
; clarify the concrete return type.
Returns:
`StatusMessageRedirector` instance for redirected logs.
Just a question - are we planning to also show the status message of the callee as the status message of the caller? Or display something like "Waiting for ... to finish"? |
actor_name = actor_data.get('name', '') if run_data else '' | ||
|
||
if not to_logger: | ||
name = '-'.join(part for part in (actor_name, run_id) if part) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure it's a good idea to include the run id by default? The user won't know what it is and it makes the logline kinda messy, especially with nested invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actor name is for readability and run id is for uniqueness and ability to track the log to it's origin. Keeping only the actor name would make it harder to find the origin of the log.
Depending on the use-case this might seem less or more relevant. Maybe it could be something like f"{actor_name}-runId:{run_id}" to make it more explicit and clear what the id refers to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And in which case do you actually need to track the log to its origin? Isn't the most frequent case calling a single Actor and showing the status so that there's some activity in the log?
Also, a -
may not be an ideal separator, those appear in Actor names quite often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Any scenario where one actor is calling two different runs of another actor. There it would be crucial, but it is not a common use case.
- In normal "one-to-one" scenario it is useful to easily navigate to the relevant run of the called actor. That I do not think is uncommon scenario. Is there any other convenient way how to navigate to the called actor run that would make this information redundant in the logs?
About separator I am fine with anything. In logger it can be even white space I guess. Do you have any preference there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case 1, it would be nice if the user could opt into logging the run id.
In case 2, I'd prefer to just log the run id when starting the Actor. But don't take it as an authoritative answer, I'm just concerned about printing the run id a zillion times on each line.
As a separator, a space sounds like a good idea. Maybe with some arrow or >>
for nested Actor runs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I do not want to add any arguments for customization of the default logger is that it will be part of functions that already have many arguments, for example actor.call . So my idea was, either use default without any customization options, or create your own logger and nothing in between. This gives you all the freedom and does not add too many new arguments to functions that already have too many of them.
What about following idea to reduce the run id spam in the logs. Each redirected logger will first print some full identification message, like: "Redirect logger for actor XYZ and run id ABCDEF... will be shown as {some alias}"
Then we can talk about this alias. Which can be for example "apify.{actor_name} {redirect_logger_counter}" or "apify.redirect {n}" or something else ...?
So it will be shorter and you will still be able to uniquely identify the origin of the redirected log with the usage of the first log message.
BUT!!! Going on with such approach you will not be able to identify deeply nested redirected loggers if you choose an option to not redirect from the start of the actor run (in case of long running standby actors) as you will miss the first redirected logger identification message.
Having that in mind, I still feel that full info in each message is the safest for the default logger, even though it is quite verbose. I will add documentation once it is also integrated into SDK that will show how easy it is to create logger that allows the customization.
check_period: The period with which the status message will be polled. | ||
""" | ||
self._to_logger = to_logger | ||
self._to_logger.propagate = self._force_propagate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not nice to silently reconfigure a logger passed in by an unsuspecting client 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. It should have been reconfigured only if the internal _force_propagate
is changed to True
, which is only the test use-case for being able to use caplog.
Currently this change is only about logging callee stuff. The status message of the caller is already available in the actor run. So if someone wants to duplicate it into logs then I think they can do it by manually both setting the status message and logging the status message, but I would not do it by default for now. |
StatusMessageRedirector
StatusMessageWatcher
Description
Add the option to log status and status messages of another Actor run. If such Actor run is also redirecting logs from another Actor run, then those logs can propagate all the way to the top through the
StreamedLog
- deep redirection. If the Actor run from which we are redirecting status messages is not redirecting logs from its children, then the log and status redirection remains shallow - only from that Actor run.Example usage:
Through
ActorClient(Async).call
By default it is turned on, so just calling actor like this already redirects the logs to pre-configured logger:
To turn off, pass
logger=None
Or pass custom logger, if desired:
Through
RunClient(Async).get_status_message_watcher
With context:
Manually:
Issues