File references are strings of bytes that can be encountered in the file_reference fields of document and photo objects.
They must be cached by the client, along with the origin where the document/photo object was found, in order to be refetched when the file reference expires.
Example implementations of a reference database: MadelineProto, Android, Telegram Desktop, TDLib.
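As a rough illustration of what such a database has to hold, here is a minimal in-memory sketch. The names FileReferenceDb, Origin and Entry, and the (kind, id) key, are assumptions made for this example; none of the implementations listed above is claimed to work this way.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Origin:
    """Hypothetical origin record: a refetch action plus its arguments."""
    action: str
    args: tuple  # (name, value) pairs, kept as a tuple so the record is hashable


@dataclass
class Entry:
    file_reference: bytes
    origins: set = field(default_factory=set)


class FileReferenceDb:
    """In-memory cache keyed by (kind, id), e.g. ("document", 123)."""

    def __init__(self):
        self._entries = {}

    def store(self, kind, file_id, file_reference, origin):
        entry = self._entries.setdefault((kind, file_id), Entry(file_reference))
        entry.file_reference = file_reference  # keep the most recently seen reference
        entry.origins.add(origin)              # remember every place it was seen

    def lookup(self, kind, file_id):
        return self._entries.get((kind, file_id))


db = FileReferenceDb()
db.store("document", 123, b"\x01old", Origin("getMessage", (("peer", 42), ("id", 7))))
db.store("document", 123, b"\x02new", Origin("getMessage", (("peer", 99), ("id", 3))))
```

Storing the same document twice keeps the newest reference and accumulates both origins, so either one can later be used to refetch it.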
Implementation and maintenance of the file reference database may be fully automated by using the following file reference origin definition file.
Latest file reference origin definition file for the current layer »
First, some definitions:
File reference path: a path in a deserialized object graph at which a file_reference field may appear, for example updateNewMessage.message -> message.media -> messageMediaDocument.document -> document.file_reference.
Origin: the context needed to refetch the object that contains the file reference, for example getMessage{peer: updateNewMessage.message.peer, id: updateNewMessage.message.id}, which can be used to refetch the document using either messages.getMessages or channels.getMessages depending on the type of the Peer.
The definition file contains all possible origins, for all possible file reference paths.
It is automatically generated and validated by the file reference origin generator using a set of manually-specified rules (specifying the actual origins) and the latest API schema, to make sure the following rules are followed:
All possible and valid places where a file_reference appears have at least one valid associated origin: this is checked by recursively checking all possible deserialized object graphs, starting from method call responses and from Update constructors.
This covers all possible TL payloads received from the API, which can only return method call responses, or Update constructors contained inside Updates.
For example, when checking the graph of the updateNewMessage constructor, the following file reference paths are found:
updateNewMessage.message -> message.media -> messageMediaDocument.document -> document.file_reference
updateNewMessage.message -> message.reply_to -> messageReplyHeader.reply_media -> messageMediaDocument.document -> document.file_reference
updateNewMessage.message -> message.media -> messageMediaInvoice.extended_media -> messageExtendedMedia.media -> messageMediaDocument.document -> document.file_reference
... and many others
... and for all paths, the system which generates the definition file makes sure that at least one origin covers the path.
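The recursive check can be pictured with a small sketch. Everything here is an assumption made for illustration: SCHEMA is a tiny hand-written subset rather than the real API schema, and the covering rule (an origin covers a path if the path passes through its predicate) is a simplification of whatever the actual generator does.

```python
# Toy schema: constructor -> {param: [constructors that param may hold]}.
SCHEMA = {
    "updateNewMessage": {"message": ["message"]},
    "message": {"media": ["messageMediaDocument", "messageMediaPhoto"]},
    "messageMediaDocument": {"document": ["document"]},
    "messageMediaPhoto": {"photo": ["photo"]},
    "document": {"file_reference": []},
    "photo": {"file_reference": []},
}


def file_reference_paths(root, schema=SCHEMA, steps=(), seen=frozenset()):
    """Yield every path from root that ends in a file_reference field."""
    if root in seen:
        return  # do not loop forever on recursive types
    seen = seen | {root}
    for param, targets in schema.get(root, {}).items():
        here = steps + (f"{root}.{param}",)
        if param == "file_reference":
            yield " -> ".join(here)
        for target in targets:
            yield from file_reference_paths(target, schema, here, seen)


def is_covered(path, origin_predicates):
    """Simplified rule: some step of the path starts at an origin's predicate."""
    return any(step.split(".")[0] in origin_predicates for step in path.split(" -> "))


paths = list(file_reference_paths("updateNewMessage"))
uncovered = [p for p in paths if not is_covered(p, {"message"})]
```

With a single origin on message, both toy paths are covered and uncovered is empty; removing that origin would leave every path orphaned.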
All origins must be used in at least one path.
All paths covered by origins which make use of flag fields (which may be absent, leading to an orphan, context-less file reference) must be covered by at least one non-flagged origin (or the flagged origin must be non-flagged in the specified path).
For example, the file reference path updateStory.story -> storyItem.media -> messageMediaDocument.document -> document.file_reference would rely on the origin getStory{peer: storyItem.from_id, story_id: storyItem.id} with stories.getStoriesByID.
However, the from_id field of storyItem is actually a flag, and in this specific path it is not set (it's only set when a storyItem is returned by stories.getAllStories).
The validator (the code that generated the definition file; you DO NOT have to implement a validator yourself) noticed this, which forced the manual addition of the valid fallback origin getStory{peer: updateStory.peer, story_id: updateStory.story -> storyItem.id}, which is present in the final origin definition file.
Note that the definition file is already pre-validated: no additional validation is needed to implement it. The above is just an example of a case that is successfully covered by the validator.
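From the client's point of view, the story example amounts to trying the flagged origin first and falling back to the non-flagged one. The sketch below illustrates that idea only; the dict layout with a "_" predicate key and the function names are assumptions, not part of the definition file.

```python
def origin_from_story_item(update):
    """Flagged origin: usable only when storyItem.from_id is present."""
    story = update["story"]
    if story.get("from_id") is None:
        return None  # flag absent on this path: the origin cannot be built
    return {"action": "getStory", "peer": story["from_id"], "story_id": story["id"]}


def origin_from_update(update):
    """Fallback origin built from updateStory.peer, which is not a flag here."""
    return {"action": "getStory", "peer": update["peer"], "story_id": update["story"]["id"]}


def first_usable_origin(update, builders=(origin_from_story_item, origin_from_update)):
    """Return the first origin that can actually be built, or None."""
    for build in builders:
        origin = build(update)
        if origin is not None:
            return origin
    return None


update = {
    "_": "updateStory",
    "peer": {"_": "peerUser", "user_id": 1},
    "story": {"_": "storyItem", "id": 7},  # from_id is not set on this path
}
origin = first_usable_origin(update)
```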
Implementation of a file reference database based on the origin definition file can be done as follows:
The definition file uses the following TL schema:
// Root
fileReferenceOrigins ctxs:Vector<Origin> = FileReferenceOrigins;
origin flags:# predicate:string is_constructor:flags.0?true action:flags.1?ActionOp noop:flags.2?string needs_parent:flags.3?string parent_is_constructor:flags.4?true = Origin;
// For string => TypedOp dictionaries
typedOpArg key:string value:TypedOp = TypedOpArg;
// Actions
callOp method:string args:Vector<TypedOpArg> = ActionOp;
getMessageOp flags:# peer:TypedOp id:TypedOp from_scheduled:flags.0?TypedOp = ActionOp;
// Field extraction path
paramNotFlag = ParamFlag;
paramIsFlagAbortIfEmpty = ParamFlag;
paramIsFlagFallback fallback:TypedOp = ParamFlag;
paramIsFlagPassthrough = ParamFlag;
pathPart flags:# constructor:string param:string flag:ParamFlag = PathPart;
path parts:Vector<PathPart> = Path;
pathParent parts:Vector<PathPart> = Path;
// Typed constructors, the type is specified to simplify codegen,
// but isn't strictly necessary as it can be inferred from the TypedOpOp.
// It is fully pre-validated during the generation of the definition file.
typedOp type:string op:TypedOpOp = TypedOp;
copyOp path:Path = TypedOpOp;
copyFromParentOp path:Path = TypedOpOp;
getInputChannelByIdOp path:Path = TypedOpOp;
getInputUserByIdOp path:Path = TypedOpOp;
getInputPeerOp path:Path = TypedOpOp;
getInputUserOp path:Path = TypedOpOp;
getInputChannelOp path:Path = TypedOpOp;
getStickerSetFromDocumentAttributesOp path:Path = TypedOpOp;
// Literals & constructors (methods not allowed or needed here)
constructorOp constructor:string args:Vector<TypedOpArg> = TypedOpOp;
vectorOp values:Vector<TypedOp> = TypedOpOp;
intLiteralOp value:int = TypedOpOp;
longLiteralOp value:long = TypedOpOp;
stringLiteralOp value:string = TypedOpOp;
boolLiteralOp value:Bool = TypedOpOp;
doubleLiteralOp value:double = TypedOpOp;
themeFormatLiteralOp = TypedOpOp;
Here's a detailed description of the constructors.
Note: The definition file assumes that all Updates constructors have already been converted to a vector of Update constructors, including short variants like updateShortMessage, updateShortSentMessage, updateShortChatMessage, which must be pre-converted to Update constructors by the client using information extracted from the method call.
While this operation could be done within the file reference origin definition file, it would needlessly increase the number of paths: given that most clients already convert short constructors to Update constructors, the file reference origin definition file only considers paths starting from the Update constructors.
fileReferenceOrigins ctxs:Vector<Origin> = FileReferenceOrigins;
origin flags:# predicate:string is_constructor:flags.0?true action:flags.1?ActionOp noop:flags.2?string needs_parent:flags.3?string parent_is_constructor:flags.4?true = Origin;
The definition file is composed of a single fileReferenceOrigins constructor, which contains a list of origin constructors.
Each origin represents a constructor or method file reference origin:
predicate - Indicates the name of the constructor or of the method where extraction of the origin fields must start.
is_constructor - If set, predicate points to a constructor; otherwise it points to a method.
needs_parent - If set, contains the name of a constructor which needs to appear as a parent in the deserialized object, or of a method whose response we're deserializing, as it will be used by one or more of the paths in the action.
parent_is_constructor - If set, needs_parent points to a constructor; otherwise it points to a method.
action - Optional: contains the method that needs to be invoked to refresh the reference, when needed.
noop - Optional: contains a human-readable description of why this origin should be ignored.
Exactly one of the mutually exclusive action and noop flags must be set.
If the noop flag is set, this origin should be ignored completely (including during codegen): noop origins are used internally to make sure all file reference paths are still covered in some way during validation, including paths for ephemeral media like inline results, or for media without any associated origin (for example, media uploaded using messages.uploadMedia but not yet sent anywhere obviously does not have any associated origin).
If the action flag is set and the is_constructor flag is set, when deserializing a constructor with predicate equal to predicate, contained in an incoming Update or at any depth in a method call response, do the following:
Skip the origin if needs_parent is set but we don't have a constructor or method of the appropriate type in our parents.
Add all encountered file_references to all origins of all types currently on the stack.
Check whether the ActionOp can be evaluated (all required flag fields are set, all paths can be correctly extracted; the check can be done during evaluation, aborting the evaluation on error), and if yes, evaluate and commit the origin to the database.
If the action flag is set and the is_constructor flag is not set, when deserializing the response of the method with name equal to predicate, do the following:
Push originName to the origin stack.
Add all encountered file_references to all origins of all types currently on the stack.
Check whether the ActionOp can be evaluated (all required flag fields are set, all paths can be correctly extracted; the check can be done during evaluation, aborting the evaluation on error), and if yes, evaluate and commit the origin to the database.
Note that method origins cannot make use of needs_parent.
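The stack handling can be sketched as follows. This is only one reading of the steps, under several assumptions: deserialized objects are plain dicts with the predicate under a hypothetical "_" key, origin_predicates stands in for the constructor origins loaded from the definition file, needs_parent is not modelled, and evaluating the ActionOp is reduced to recording the collected references.

```python
def collect_origins(obj, origin_predicates, stack=None, committed=None):
    """Walk a deserialized object, attaching file references to open origins."""
    stack = [] if stack is None else stack
    committed = [] if committed is None else committed

    if isinstance(obj, list):
        for item in obj:
            collect_origins(item, origin_predicates, stack, committed)
        return committed
    if not isinstance(obj, dict):
        return committed

    frame = None
    if obj.get("_") in origin_predicates:
        # This constructor is an origin: open a frame for it on the stack.
        frame = {"predicate": obj["_"], "object": obj, "file_references": []}
        stack.append(frame)

    for key, value in obj.items():
        if key == "file_reference" and isinstance(value, bytes):
            # Every origin currently open gets to know about this reference.
            for open_frame in stack:
                open_frame["file_references"].append(value)
        else:
            collect_origins(value, origin_predicates, stack, committed)

    if frame is not None:
        stack.pop()
        if frame["file_references"]:
            committed.append(frame)  # a real client would evaluate the ActionOp here
    return committed


update = {
    "_": "updateNewMessage",
    "message": {
        "_": "message",
        "id": 5,
        "media": {
            "_": "messageMediaDocument",
            "document": {"_": "document", "id": 9, "file_reference": b"ref"},
        },
    },
}
committed = collect_origins(update, {"message"})
```

Because every open frame receives each reference, a document nested inside a message ends up associated with the message origin as well as with any inner origin.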
// Actions
callOp method:string args:Vector<TypedOpArg> = ActionOp;
getMessageOp flags:# peer:TypedOp id:TypedOp from_scheduled:flags.0?TypedOp = ActionOp;
// For string => TypedOp dictionaries
typedOpArg key:string value:TypedOp = TypedOpArg;
// Action parameters
// Typed constructors, the type is specified to simplify codegen,
// but isn't strictly necessary as it can be inferred from the TypedOpOp.
// It is fully pre-validated during the generation of the definition file.
typedOp type:string op:TypedOpOp = TypedOp;
Actions are stored (associated with one or more file references) after the deserialization of a method or constructor (as specified above »), and executed when one of the file references expires (i.e. a FILE_REFERENCE_EXPIRED RPC error is returned when using it).
The arguments are composed of a set of typedOp constructors.
typedOp » is a wrapper for a TypedOpOp constructor which also contains the TL type of the associated TypedOpOp; this isn't strictly necessary for evaluation, but it can be useful during automatic code generation from the definition file.
callOp
callOp is a generic action which invokes the method specified in method with the arguments specified in args.
callOp.args will always contain at least all of the required parameters, and possibly some flagged parameters as well.
getMessageOp
getMessageOp is a specialized action which invokes either messages.getMessages or channels.getMessages depending on the type of the peer, passing the id as the only element of the vector id parameter.
If the from_scheduled flag is present and set, and the flag to which the path points is also set, messages.getScheduledMessages should be executed instead.
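The method selection can be sketched like this. The returned dict is a hypothetical intermediate form rather than the exact TL parameter layout of each method, and treating any predicate starting with inputPeerChannel as a channel peer is an assumption of this example.

```python
def build_get_message_request(peer, message_id, from_scheduled=False):
    """Pick the refetch method for an already evaluated getMessageOp."""
    if from_scheduled:
        method = "messages.getScheduledMessages"
    elif peer["_"].startswith("inputPeerChannel"):
        method = "channels.getMessages"
    else:
        method = "messages.getMessages"
    # The single message ID is wrapped in a list, mirroring the vector id parameter.
    return {"method": method, "peer": peer, "id": [message_id]}


channel = {"_": "inputPeerChannel", "channel_id": 10, "access_hash": 20}
user = {"_": "inputPeerUser", "user_id": 1, "access_hash": 2}
request = build_get_message_request(channel, 55)
```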
copyOp path:Path = TypedOpOp;
copyFromParentOp path:Path = TypedOpOp;
getInputChannelByIdOp path:Path = TypedOpOp;
getInputUserByIdOp path:Path = TypedOpOp;
getInputPeerOp path:Path = TypedOpOp;
getInputUserOp path:Path = TypedOpOp;
getInputChannelOp path:Path = TypedOpOp;
getStickerSetFromDocumentAttributesOp path:Path = TypedOpOp;
// Literals & constructors (methods not allowed or needed here)
constructorOp constructor:string args:Vector<TypedOpArg> = TypedOpOp;
vectorOp values:Vector<TypedOp> = TypedOpOp;
intLiteralOp value:int = TypedOpOp;
longLiteralOp value:long = TypedOpOp;
stringLiteralOp value:string = TypedOpOp;
boolLiteralOp value:Bool = TypedOpOp;
doubleLiteralOp value:double = TypedOpOp;
themeFormatLiteralOp = TypedOpOp;
Action parameters are represented by TypedOpOp constructors.
copyOp
The most commonly used type: copies the value(s) at the specified path ».
copyFromParentOp
Copies the value(s) at the specified path », starting from the constructor/method specified in origin.needs_parent.
getInputChannelByIdOp
Returns an InputChannel constructor from the client's peer database, based on the channel ID of type long specified in path.
getInputUserByIdOp
Returns an InputUser constructor from the client's peer database, based on the user ID of type long specified in path.
getInputPeerOp
Transforms and returns the Peer constructor to which path points into an InputPeer constructor.
getInputUserOp
Transforms and returns the User constructor to which path points into an InputUser constructor.
getInputChannelOp
Transforms and returns the Channel constructor to which path points into an InputChannel constructor.
getStickerSetFromDocumentAttributesOp
Takes the Vector<DocumentAttribute> to which path points, looks for a documentAttributeSticker, and returns the InputStickerSet contained in documentAttributeSticker.stickerset; aborts if there is no attribute of type documentAttributeSticker in the passed vector.
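As an illustration, getStickerSetFromDocumentAttributesOp could be evaluated roughly as below. The dict representation with a "_" predicate key and the AbortExtraction exception are assumptions of this sketch; how a real client signals an aborted evaluation is up to the implementation.

```python
class AbortExtraction(Exception):
    """Hypothetical signal that an op could not be evaluated."""


def sticker_set_from_attributes(attributes):
    """Return the stickerset of the first documentAttributeSticker, or abort."""
    for attribute in attributes:
        if attribute.get("_") == "documentAttributeSticker":
            return attribute["stickerset"]
    raise AbortExtraction("no documentAttributeSticker in the passed vector")


attributes = [
    {"_": "documentAttributeImageSize", "w": 512, "h": 512},
    {"_": "documentAttributeSticker", "alt": ":)",
     "stickerset": {"_": "inputStickerSetID", "id": 1, "access_hash": 2}},
]
sticker_set = sticker_set_from_attributes(attributes)
```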
constructorOp
Constructs the constructor of type (predicate) constructor using the arguments specified in args.
vectorOp
Constructs a vector of the constructors passed in values.
intLiteralOp
Constructs a literal int with the value passed in value.
longLiteralOp
Constructs a literal long with the value passed in value.
stringLiteralOp
Constructs a literal string with the value passed in value.
boolLiteralOp
Constructs a literal Bool with the value passed in value.
doubleLiteralOp
Constructs a literal double with the value passed in value.
themeFormatLiteralOp
Constructs a string, indicating the theming engines supported by the client (used when working with theme-related media, can be an empty string if the client doesn't support themes).
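The literal and constructor ops lend themselves to a small recursive evaluator. The sketch below assumes the definition file has been parsed into dicts carrying the predicate under a hypothetical "_" key; path-based ops such as copyOp are deliberately left out, and CLIENT_THEME_FORMAT is a placeholder for whatever the client actually supports.

```python
CLIENT_THEME_FORMAT = ""  # assumption: this client supports no theming engine

LITERAL_OPS = {
    "intLiteralOp", "longLiteralOp", "stringLiteralOp",
    "boolLiteralOp", "doubleLiteralOp",
}


def evaluate(typed_op):
    """Evaluate a typedOp dict; the type field is ignored, only op is used."""
    op = typed_op["op"]
    kind = op["_"]
    if kind in LITERAL_OPS:
        return op["value"]
    if kind == "themeFormatLiteralOp":
        return CLIENT_THEME_FORMAT
    if kind == "vectorOp":
        return [evaluate(value) for value in op["values"]]
    if kind == "constructorOp":
        result = {"_": op["constructor"]}
        for arg in op["args"]:  # each arg is a typedOpArg: key plus TypedOp value
            result[arg["key"]] = evaluate(arg["value"])
        return result
    raise NotImplementedError(f"op not covered by this sketch: {kind}")


def literal(tl_type, kind, value):
    """Helper for building a typedOp wrapping a literal op."""
    return {"_": "typedOp", "type": tl_type, "op": {"_": kind, "value": value}}


sticker_set_op = {
    "_": "typedOp",
    "type": "InputStickerSet",
    "op": {
        "_": "constructorOp",
        "constructor": "inputStickerSetShortName",
        "args": [{"_": "typedOpArg", "key": "short_name",
                  "value": literal("string", "stringLiteralOp", "example")}],
    },
}
built = evaluate(sticker_set_op)
```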
paramNotFlag = ParamFlag;
paramIsFlagAbortIfEmpty = ParamFlag;
paramIsFlagFallback fallback:TypedOp = ParamFlag;
paramIsFlagPassthrough = ParamFlag;
pathPart flags:# constructor:string param:string flag:ParamFlag = PathPart;
path parts:Vector<PathPart> = Path;
pathParent parts:Vector<PathPart> = Path;
Paths are used by action parameters to extract a parameter from one or more constructors, e.g. updateStory.story -> storyItem.media -> messageMediaDocument.document -> document.file_reference.
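Walking such a path over a deserialized object might look like the sketch below, which uses the ParamFlag constructors from the schema above. It makes several simplifying assumptions: objects are dicts with the predicate under a hypothetical "_" key, an unset flag is a missing or None entry, vectors and method return values (empty param) are not handled, and a paramIsFlagFallback carries an already evaluated value instead of a TypedOp.

```python
class AbortExtraction(Exception):
    """Hypothetical signal that the path cannot be extracted."""


def part(constructor, param, flag="paramNotFlag", **extra):
    """Helper for building a pathPart dict."""
    return {"constructor": constructor, "param": param, "flag": {"_": flag, **extra}}


def extract(root, parts):
    """Follow parts from root and return the value the path points to."""
    current = root
    for p in parts:
        if not isinstance(current, dict) or current.get("_") != p["constructor"]:
            raise AbortExtraction(f"expected {p['constructor']}")
        value = current.get(p["param"])
        if value is None:
            flag = p["flag"]
            if flag["_"] == "paramIsFlagFallback":
                value = flag["fallback"]
            elif flag["_"] == "paramIsFlagPassthrough":
                return None  # only valid on the last part: pass the unset flag on
            else:
                raise AbortExtraction(f"{p['constructor']}.{p['param']} is not set")
        current = value
    return current


PATH = [
    part("updateStory", "story"),
    part("storyItem", "media"),
    part("messageMediaDocument", "document", "paramIsFlagAbortIfEmpty"),
    part("document", "file_reference"),
]
update = {"_": "updateStory", "story": {"_": "storyItem", "id": 7, "media": {
    "_": "messageMediaDocument",
    "document": {"_": "document", "id": 9, "file_reference": b"ref"}}}}
value = extract(update, PATH)
```

Raising an exception on abort keeps the walker short and lets the caller treat any failed extraction as "this origin cannot be evaluated here".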
The first part of the path always points to:
The predicate of the origin (origin.predicate), for path
origin.needs_parent, for pathParent
A path is composed of multiple pathParts.
Each pathPart contains the following fields, which describe how to extract the field.
constructor - Indicates the required constructor/method predicate. If a different constructor type is encountered (e.g. documentEmpty instead of document), abort extraction.
param - Indicates the required parameter; if it's an empty string, it indicates the return value of a method.
flag - Contains exactly one of the following constructors:
paramNotFlag - The current parameter is not a flag.
paramIsFlagAbortIfEmpty - The current parameter is a flag, and if it's not set, abort extraction.
paramIsFlagFallback - The current parameter is a flag, and if it's not set, use the specified TypedOp as fallback value.
paramIsFlagPassthrough - The current parameter is a flag, and its value should be copied/returned verbatim; can only be used on the last element of a path and within the arguments of a constructorOp/callOp/getMessageOp only if the argument that uses this path is a flag of the same type.
Assume you receive a message from your friend: that message contains a messageMediaPhoto with a photo.
Your client has to cache not only the file_reference field of the photo, but also the context in which the file reference was seen (in this case, a message coming from a specific user).
In this case, the context info is an origin of type message, containing the message ID and the peer ID of the chat/channel/user where the message was seen.
The context info has to be associated with the file reference: when downloading a file using upload.getFile or resending it using messages.sendMedia, a FILE_REFERENCE_EXPIRED error may be returned.
messages.sendMultiMedia returns a variation of the same error, as a FILE_REFERENCE_%d_EXPIRED error (where %d is the index of the media with the expired file reference in the passed media array).
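Both error variants can be recognized with one pattern. A minimal sketch, assuming the client exposes the RPC error message as a plain string; the function name is made up for this example.

```python
import re

# Matches FILE_REFERENCE_EXPIRED and FILE_REFERENCE_<index>_EXPIRED.
_EXPIRED = re.compile(r"FILE_REFERENCE_(?:(\d+)_)?EXPIRED")


def parse_file_reference_expired(error_message):
    """Return (is_expired, media_index); media_index is None for the plain error."""
    match = _EXPIRED.fullmatch(error_message)
    if match is None:
        return (False, None)
    index = match.group(1)
    return (True, int(index) if index is not None else None)


plain = parse_file_reference_expired("FILE_REFERENCE_EXPIRED")
indexed = parse_file_reference_expired("FILE_REFERENCE_2_EXPIRED")
```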
If this happens, the context info must be used to refetch the object that contained the file reference: in this example, the peer info and the message ID have to be used with channels.getMessages or messages.getMessages to refetch the message, recache the file reference and use it in a new file download request.
More than one origin can be associated with one file reference, for greater resilience (for example, if a message was deleted in one chat but had also been forwarded to another chat, the file reference can be refetched from the second chat instead).
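The refresh-and-retry flow with several origins can be sketched as follows. The fetch and refetch callables, the entry layout and the FileReferenceExpired exception are all stand-ins invented for this example; a real client would call upload.getFile and the origin's stored action instead.

```python
class FileReferenceExpired(Exception):
    """Stand-in for a FILE_REFERENCE_EXPIRED RPC error."""


def download_with_refresh(entry, fetch, refetch):
    """Try the cached reference; on expiry, try each origin until one works."""
    try:
        return fetch(entry["file_reference"])
    except FileReferenceExpired:
        for origin in entry["origins"]:
            fresh = refetch(origin)  # returns None if the origin is gone
            if fresh is not None:
                entry["file_reference"] = fresh  # recache before retrying
                return fetch(fresh)
        raise  # no origin could provide a new reference


def fake_fetch(file_reference):
    if file_reference != b"new":
        raise FileReferenceExpired()
    return b"file-bytes"


def fake_refetch(origin):
    # Pretend the message in chat-a was deleted, while the forward in chat-b survives.
    return None if origin["peer"] == "chat-a" else b"new"


entry = {
    "file_reference": b"old",
    "origins": [{"peer": "chat-a", "id": 1}, {"peer": "chat-b", "id": 2}],
}
data = download_with_refresh(entry, fake_fetch, fake_refetch)
```

Here the first origin fails, the second supplies a fresh reference, and the download succeeds on the retry.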
Origins for objects returned by method calls with certain parameters can be considered, too (for example, in the case of favorited sticker sets returned by messages.getFavedStickers).