
CN111711865A - Method, apparatus and storage medium for outputting data - Google Patents


Info

Publication number
CN111711865A
Authority
CN
China
Prior art keywords
data
video
associated data
playing picture
picture
Prior art date
Legal status
Pending
Application number
CN202010621914.5A
Other languages
Chinese (zh)
Inventor
马志强
Current Assignee
Zhejiang Tonghuashun Intelligent Technology Co Ltd
Original Assignee
Zhejiang Tonghuashun Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Tonghuashun Intelligent Technology Co Ltd
Priority to CN202010621914.5A
Publication of CN111711865A
Status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/442: Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N 21/44213: Monitoring of end-user related data
    • H04N 21/44218: Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H04N 21/47: End-user applications
    • H04N 21/472: End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N 21/475: End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H04N 21/4756: End-user interface for inputting end-user data for rating content, e.g. scoring a recommended movie
    • H04N 21/478: Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N 21/4788: Supplemental services communicating with other users, e.g. chatting

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of this application disclose a method, a device, and a storage medium for outputting data. The method includes: while a first video is playing, judging whether at least one piece of associated data related to the currently playing picture of the first video can be obtained; if so, obtaining the at least one piece of associated data related to the currently playing picture of the first video; and outputting at least part of the at least one piece of associated data, where the at least part of the associated data can be output in audio or video form and the associated data was generated by a viewer for the currently playing picture. Associated data such as insights, impressions, and comments is thus output automatically, and the output is both targeted and engaging.

Description

Method, apparatus and storage medium for outputting data
Technical Field
The present application relates to data output technologies, and in particular, to a method, an apparatus, and a storage medium for outputting data.
Background
When a user watching a video wants to record an insight about, or comment on, the content of one of its playing pictures, the user usually has to type the comment into the comment area; with the bullet screen enabled, the bullet-screen content is then output over the playing pictures. Besides presenting a user's insights and comments as bullet screens, a screenshot can be taken and posted into the comment area together with the insight or comment. Both presentation methods have drawbacks. Bullet-screen content is usually presented throughout the whole video playback, so it cannot be known which picture in the video a given bullet-screen comment was made for. The screenshot method requires the user to first capture the screen, then enter the insight or comment in the comment area and attach the screenshot there, which demands excessive user involvement and is insufficiently intelligent. In addition, with both methods, information such as a user's insight about or comment on the content of a playing picture cannot be output in a timely manner, so the efficiency of information transfer is low.
Disclosure of Invention
To solve the above problems, the present application provides a method, a device, and a storage medium for outputting data.
In a first aspect, an embodiment of this application provides a method for outputting data, the method including:
while a first video is playing, judging whether at least one piece of associated data related to the currently playing picture of the first video can be obtained;
if so, obtaining the at least one piece of associated data related to the currently playing picture of the first video;
outputting at least part of the at least one piece of associated data;
where the at least part of the associated data can be output in audio or video form, and the associated data was generated by a viewer for the currently playing picture.
In the above scheme, before obtaining the at least one piece of associated data related to the currently playing picture, the method includes:
collecting at least one piece of audio data and/or at least one piece of video data generated by at least one viewer for the currently playing picture;
determining the collected audio data and video data to be associated data related to the currently playing picture; and
storing the associated data in correspondence with the identification information of the currently playing picture.
In the above scheme, obtaining the at least one piece of associated data related to the currently playing picture includes:
obtaining the identification information of the currently playing picture; and
from the correspondingly stored associated data and playing-picture identification information, determining the associated data corresponding to the identification information of the currently playing picture to be the associated data related to the currently playing picture.
In the above scheme, the method further includes:
detecting a first predetermined operation generated by the viewer on the currently playing picture;
obtaining an operation attribute of the first predetermined operation;
presenting a capture function key when the operation attribute satisfies a predetermined condition, the capture function key being used to collect audio data and/or video data generated by the viewer for the currently playing picture; and
detecting the viewer's operation of the capture function key, and collecting at least one piece of audio data and/or at least one piece of video data generated by the viewer for the currently playing picture.
In the above scheme, the capture function keys include a first capture function key and a second capture function key; the first capture function key is used to collect audio data generated by the viewer for the currently playing picture, and the second capture function key is used to collect video data generated by the viewer for the currently playing picture;
when an operation on the first function key is detected, an audio capture device is started, and the audio capture device collects the audio data generated by the viewer for the currently playing picture;
or,
when an operation on the second function key is detected, a video capture device is started, and the video capture device collects the video data generated by the viewer for the currently playing picture.
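As a hedged browser-side sketch, the two capture keys can start the corresponding capture device through the standard `navigator.mediaDevices.getUserMedia` call; the element ids and the `beginRecording` helper are illustrative assumptions, not names defined by this application.

```typescript
// Hypothetical capture keys; the element ids are illustrative only.
const firstKey = document.getElementById("capture-audio-key")!;
const secondKey = document.getElementById("capture-video-key")!;

// First capture function key: start the audio capture device (microphone).
firstKey.addEventListener("click", async () => {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  beginRecording(stream);
});

// Second capture function key: start the video capture device (camera,
// together with the microphone so the spoken comment is captured too).
secondKey.addEventListener("click", async () => {
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
  });
  beginRecording(stream);
});

// Recording itself is sketched later in this document.
declare function beginRecording(stream: MediaStream): void;
```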
In the above scheme, after obtaining the at least one piece of associated data related to the currently playing picture, the method further includes:
presenting prompt data, where the prompt data is prompt data for the at least part of the associated data and is used to indicate that the at least part of the associated data can be output;
detecting a second predetermined operation on the prompt data, the second predetermined operation being an operation for outputting the at least part of the associated data;
generating a first instruction based on the second predetermined operation, the first instruction being an instruction to output the at least part of the associated data;
in response to the first instruction, obtaining the output type of the at least part of the associated data, the output type characterizing output in audio or video form; and
outputting the at least part of the associated data in that output type.
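For illustration, responding to the first instruction and dispatching on the output type might look as follows in TypeScript; the `FirstInstruction` shape is an assumption of this sketch.

```typescript
type OutputType = "audio" | "video";

// A first instruction: output the given associated data in its output type.
interface FirstInstruction {
  outputType: OutputType;
  media: Blob; // the at least part of the associated data to output
}

function respondToFirstInstruction(instruction: FirstInstruction): void {
  const url = URL.createObjectURL(instruction.media);
  if (instruction.outputType === "audio") {
    // Output in audio form.
    void new Audio(url).play();
  } else {
    // Output in video form, in a player element added to the page.
    const player = document.createElement("video");
    player.src = url;
    player.controls = true;
    document.body.appendChild(player);
    void player.play();
  }
}
```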
In the above scheme, when two or more pieces of associated data are obtained, the method further includes:
obtaining the prompt data for each piece of associated data;
determining the presentation position of each piece of prompt data on the display screen; and
displaying each piece of prompt data at its corresponding presentation position.
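A minimal sketch of allocating non-overlapping presentation positions for several pieces of prompt data follows; the fixed slot size is an illustrative assumption rather than a value fixed by the application.

```typescript
interface Position { x: number; y: number; }

// Gives every prompt datum its own slot so that no two pieces of
// prompt data overlap on the display screen.
function allocatePromptPositions(
  promptCount: number,
  screenWidth: number,
  slotWidth = 160,
  slotHeight = 48,
): Position[] {
  const perRow = Math.max(1, Math.floor(screenWidth / slotWidth));
  const positions: Position[] = [];
  for (let i = 0; i < promptCount; i++) {
    positions.push({
      x: (i % perRow) * slotWidth,
      y: Math.floor(i / perRow) * slotHeight,
    });
  }
  return positions;
}
```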
In a second aspect, an embodiment of this application provides a device for outputting data, the device including:
a playing unit, configured to play a first video;
a judging unit, configured to judge whether at least one piece of associated data related to the currently playing picture of the first video can be obtained and, if so, to trigger the obtaining unit;
an obtaining unit, configured to obtain the at least one piece of associated data related to the currently playing picture of the first video; and
an output unit, configured to output at least part of the at least one piece of associated data;
where the output unit outputs the at least part of the associated data in audio or video form, the associated data having been generated by a viewer for the currently playing picture.
In a third aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the steps of the foregoing method.
In a fourth aspect, an embodiment of this application provides a device for outputting data, including:
one or more processors; and
a memory communicatively coupled to the one or more processors;
where one or more applications are stored in the memory and configured to be executed by the one or more processors so as to perform the method described above.
In the embodiments of this application, associated data such as insights, impressions, and comments is output automatically, which reflects intelligence. Because the associated data of a given playing picture is output while the video is playing that very picture image, viewers can tell which picture image in the video the output data was made for, which improves the viewing experience and makes the output of associated data targeted. And because the associated data is output in audio or video form, the output is engaging.
Drawings
Fig. 1 is a first flowchart of a method for outputting data according to an embodiment of this application;
Fig. 2 is a second flowchart of a method for outputting data according to an embodiment of this application;
Fig. 3 is a third flowchart of a method for outputting data according to an embodiment of this application;
Fig. 4 is a fourth flowchart of a method for outputting data according to an embodiment of this application;
Figs. 5-8 are diagrams of several application operation interfaces according to embodiments of this application;
Figs. 9-12 show several presentations of associated data according to embodiments of this application;
Fig. 13 is a schematic structural diagram of a device for outputting data according to an embodiment of this application;
Fig. 14 is a schematic hardware configuration diagram of a device for outputting data according to an embodiment of this application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In some of the flows described in the specification, claims, and drawings of this application, a number of operations appear in a particular order; it should be clearly understood, however, that these flows may include more or fewer operations, and that the operations may be performed sequentially or in parallel.
In practice, each playing picture of a video is really an image, and for the content a given image expresses there are cases where users annotate it, record insights about it, or comment on it. In the embodiments of this application, a user's insight about or comment on a certain picture can be displayed automatically while the video is playing, so viewers of the video learn in a timely manner what other users thought about that picture, improving the efficiency of information transfer. The technical solutions of the embodiments of this application are described in detail below.
The embodiments of this application provide a method for outputting data, applicable to a device that outputs data. The device may be any device capable of outputting at least one of audio, image, and video data, such as a mobile phone, a computer, a laptop, a television, a server, or a smart wearable device such as a smart watch or smart glasses. For convenience in the description that follows, the device that outputs data is simply called the target device.
Fig. 1 shows a first embodiment of the method for outputting data in the embodiments of this application. As shown in Fig. 1, the method includes:
S101: while a first video is playing, judging whether at least one piece of associated data related to the currently playing picture of the first video can be obtained;
if so, continuing with S102;
if not, continuing with S104.
In this step, the first video may be any video a viewer can watch, such as a teaching video or an entertainment video. A playing picture is any one of the picture images in the video being watched, and the currently playing picture is the image the video is currently showing. In practice, the associated data may specifically be the insights, impressions, comments, and similar data generated by a viewer (user) about the content of the currently playing picture.
It should be understood that not every picture image in a video has associated data related to it: some picture images may have such data while others do not. The associated data related to a playing picture can be regarded as the insights, impressions, comments, and similar data users have generated about that picture's content.
This step judges, for the currently playing picture of the video, whether associated data related to that picture image can be obtained, which is equivalent to asking whether any user insights, impressions, or comments on the currently playing picture exist. If they exist, execution continues with S102; otherwise S104 is performed.
S102: obtaining at least one piece of associated data related to the currently playing picture of the first video, and continuing with S103;
S103: outputting at least part of the at least one piece of associated data, where the at least part of the associated data can be output in audio or video form, and the associated data was generated by a viewer for the currently playing picture.
If the currently playing picture in S101-S103 is regarded as the N-th playing picture of the first video (N being a positive integer greater than or equal to 1), then after S101-S103 are executed, or once it is determined that no associated data related to the N-th playing picture of the first video can be obtained, execution continues with S104;
S104: updating N to N + 1 and substituting the updated N back into S101, until the video finishes playing.
The above scheme effectively executes S101-S104 once for each playing image of the video until playback completes. Whenever playback reaches a playing image for which user insights, impressions, or comments exist, at least part of that data is obtained and output. Associated data is thus output automatically, without excessive user involvement, which reflects intelligence. Because the associated data of a given playing image is output while that very image is playing, viewers can tell which picture in the video the data was made for, which improves the viewing experience and makes the output targeted. Outputting the associated data in audio or video form avoids the fixed, single-mode, text-only output of bullet-screen content in the related art and makes the output engaging. In addition, users' insights, impressions, and comments on the content of the currently playing picture are output automatically and in time, so they reach viewers of that picture promptly, improving the efficiency of information transfer.
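As a sketch under browser assumptions, the S101-S104 loop corresponds to a playback-time handler on the video element; approximating a playing picture by the whole-second playback position, the element id, and the helper names are assumptions of this sketch rather than parts of the application.

```typescript
const firstVideo = document.querySelector<HTMLVideoElement>("#first-video")!;

let lastPictureId = -1;
firstVideo.addEventListener("timeupdate", () => {
  // S104: move on to the next playing picture (N -> N + 1).
  const pictureId = Math.floor(firstVideo.currentTime);
  if (pictureId === lastPictureId) return;
  lastPictureId = pictureId;

  // S101/S102: judge whether associated data can be obtained, and obtain it.
  const related = lookupAssociatedData(pictureId);
  if (related.length > 0) {
    outputAtLeastPart(related); // S103: output at least part of it
  }
  // Otherwise playback simply continues with the next picture.
});

// Matches the storage sketch given earlier in this document.
declare function lookupAssociatedData(pictureId: number): unknown[];
declare function outputAtLeastPart(data: unknown[]): void;
```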
S101-S104 are executed by the device that outputs data (the target device).
S103 is limited to outputting at least part of the associated data for the following reason: many users may generate insights, impressions, or comments for one and the same playing picture of a video, and when there are many of them the display screen may be unable to output them all. The embodiments of this application therefore output at least part of the insight, impression, and comment data. Quantitatively, one insight, impression, or comment a user generates for a playing picture is regarded as one piece of associated data.
As the foregoing shows, the video is played from beginning to end and the scheme of S101-S104 is performed for each of its playing pictures: if associated data related to a playing picture can be obtained, at least part of it is output; if not, playback simply continues with the next picture. In practice, the associated data related to a playing picture can be displayed within the playing video. While associated data is displayed, the playing picture for which it was generated may continue playing or may be paused; pausing is preferred. The associated data may be displayed from the moment its playing picture appears until the video ends, or only until that playing picture finishes playing, or for a preset display duration. Any other reasonable display arrangement is likewise covered; since enumeration is impossible, the details are not repeated.
Fig. 2 shows a second embodiment of the method for outputting data in the embodiments of this application. As shown in Fig. 2, the method includes:
S201: during playback of a first video, collecting at least one piece of audio data and/or at least one piece of video data generated by at least one viewer for at least part of the playing pictures of the first video;
S202: determining the collected audio data and video data to be associated data related to the at least part of the playing pictures;
S203: storing the associated data in correspondence with the identification information of the corresponding playing pictures among the at least part of the playing pictures.
S201-S203 can be regarded as the process of obtaining the associated data. It mainly collects the insights, impressions, comments, and similar data that users speak, in audio or video form, while watching a given playing picture of the video. Once such data, generated by a user in audio or video form for a playing picture, has been collected, the playing picture and the data are stored in correspondence. In a specific implementation, the playing picture's node information within the whole video, such as its playing-time information, or the playing-picture image itself, serves as the playing picture's identification information, and that identification information is stored in correspondence with the insight, impression, or comment data the user spoke in audio or video form for the picture. For example, if a user speaks insight data in audio form about the picture content at video playing time 01:30, the insight data is collected and recorded in correspondence with the playing picture at 01:30. It should be understood that associated data obtained this way better reflects the genuine reaction a user has while watching a particular playing picture and is very convenient for the user: the user only needs to speak, in video or audio form, rather than enter text, which facilitates collecting users' genuine reactions and improves intelligence. Storing the genuine reaction in correspondence with the playing picture it targets also provides convenience for the later scheme of obtaining and outputting the associated data related to a playing picture.
It should be understood that, in practice, more than one user may have insights or impressions about one and the same playing picture in a video. In the embodiments of this application, every user's insights and impressions about the same playing picture are stored according to the above scheme, so multiple pieces of insight or impression data may be stored against the identification information of a single playing picture. That is, the storage of associated data in the embodiments of this application stores, for each user who produces an insight or impression about the same playing picture, the insight or impression that user speaks in video or audio form.
S204: if, while the first video is subsequently played, it is judged that at least one piece of associated data related to the currently playing picture of the first video can be obtained, continuing with S205;
S205: obtaining the identification information of the currently playing picture;
S206: from the correspondingly stored associated data and playing-picture identification information, determining the associated data corresponding to the identification information of the currently playing picture to be the associated data related to the currently playing picture;
S207: outputting at least part of the at least one piece of associated data.
S201-S207 are executed by the device that outputs data (the target device). If it is judged that no associated data related to the currently playing picture of the first video can be obtained, refer to S104 above; repeated content is not restated.
In the scheme of S201-S203, the insights and impressions each user speaks in video or audio form about the content of a playing picture are stored, during playback, in correspondence with that picture's identification information, which ensures the accuracy of information storage. In the scheme of S204-S207, the associated data related to the currently playing picture is determined, from the stored associated data and playing-picture identification information, according to the currently playing picture's identification information, and at least part of the associated data related to the currently playing picture is output. Determining the associated data by the currently playing picture's identification information ensures the determination is accurate.
In S204-S207, judging whether at least one piece of associated data related to the currently playing picture of the first video can be obtained amounts to judging whether associated data related to the currently playing picture is stored; if the judgment is that such data is stored, it can be obtained. The identification information of the currently playing picture is obtained, for example by reading the picture's playing-time information within the whole video or the picture image itself, and the stored data is searched according to that identification information. The data found in the stored data to correspond to the currently playing picture's identification information is the associated data related to the currently playing picture, and at least part of the found associated data is output, for example displayed.
In the above scheme, for a given playing picture of a video, the same user may produce several pieces of insight, impression, or comment data, and different users may each produce such data for the same picture. When collecting insight, impression, or comment data for the same playing picture, it is therefore also necessary to record when each piece was collected, i.e., its acquisition time. When the display screen cannot output all the insights, impressions, and comments for the same playing picture at once, they are output in the order of the recorded acquisition times. For example, suppose searching the stored data finds as many as 15 pieces of insight, impression, or comment data for the currently playing picture, while the display screen currently allows at most 5 pieces of associated data to be displayed; the recorded acquisition times of these pieces are then read and the pieces are displayed in acquisition-time order: first those ranked 1st to 5th, then those ranked 6th to 10th, and finally those ranked 11th to 15th. This effectively divides the 15 pieces into three groups by time order. The display duration of each of the three groups may equal the playing duration of the currently playing picture, or 1/3 of it, or each group may be displayed from the playing time point of the current picture until the end of the video; this is not specifically limited.
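The ordering-and-grouping rule just described is simple to state in code. A sketch, assuming each piece of associated data carries the recorded acquisition time:

```typescript
interface TimedDatum { capturedAt: number; }

// Sorts associated data by its recorded acquisition time and splits it
// into display groups, e.g. 15 pieces with a display capacity of 5
// yield three groups of five, shown one after another.
function groupByAcquisitionTime<T extends TimedDatum>(
  data: T[],
  displayCapacity: number,
): T[][] {
  const ordered = [...data].sort((a, b) => a.capturedAt - b.capturedAt);
  const groups: T[][] = [];
  for (let i = 0; i < ordered.length; i += displayCapacity) {
    groups.push(ordered.slice(i, i + displayCapacity));
  }
  return groups;
}
```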
In an alternative scheme, as shown in Fig. 3, S201, collecting during playback of the first video at least one piece of audio data and/or at least one piece of video data generated by at least one viewer for at least part of the playing pictures of the first video, can further be realized as follows:
S2011: detecting, during playback of the first video, a first predetermined operation generated by the viewer on the currently playing picture;
In this step, the first predetermined operation may be any reasonable operation, such as at least one of a click, a press, a slide, and a voice operation, preferably a press or a slide. The first predetermined operation may be a press or slide the viewer (user) performs on the currently playing picture while the video being watched is showing that picture.
S2012: obtaining an operation attribute of the first predetermined operation;
In this step, if the first predetermined operation is a press, the operation attribute may be the press duration and/or the press position on the display screen. If it is a slide, the operation attribute may be the slide distance, the slide position on the display screen (start position to end position), and/or the slide gesture. If it is a click, the operation attribute may be the click frequency. If it is a voice operation, the operation attribute may be the recognized meaning of the voice data.
S2013: presenting a capture function key when the operation attribute satisfies a predetermined condition, the capture function key being used to collect audio data and/or video data generated by the viewer for the currently playing picture;
In this step, if the first predetermined operation is a press, a press duration longer than a preset duration and/or a press position at a preset location, such as the center of the display screen, may be regarded as an operation attribute satisfying the predetermined condition. If it is a slide, a slide distance greater than a preset distance, a slide position such as a parallel swipe from the left short edge of the display screen to its right short edge at a distance of 3 cm from the screen edge, and/or a preset slide gesture such as the letter "Z" may be regarded as satisfying the predetermined condition. If it is a click, a click frequency meeting a preset frequency is regarded as satisfying the predetermined condition. If it is a voice operation, recognizing voice data whose meaning is "capture audio and/or video data" may be regarded as satisfying the condition. The predetermined operation, the operation attribute, and the predetermined condition can be set flexibly for the actual application; since enumeration is impossible, they are not detailed here.
When the operation attribute satisfies the predetermined condition, a function key for collecting the audio data and/or video data generated by the viewer for the currently playing picture (the capture function key) is presented. The capture function key may be presented on the currently playing picture, or at a place on the display screen not covered by the playing picture, such as the screen's upper-left corner.
S2014: detecting the viewer's operation of the capture function key, and collecting at least one piece of audio data and/or at least one piece of video data generated by the viewer for the currently playing picture.
In this step, if the user has an insight, impression, or comment about the currently playing picture, the user operates the capture function key by touch, press, slide, or the like. The target device detects the operation of the capture function key and, regarding it as meaning that the user needs to enter an insight, impression, or comment, responds to the operation by collecting at least one piece of audio data and/or at least one piece of video data generated by the viewer for the currently playing picture.
S2011-S2014 are executed by the device that outputs data (the target device).
In S2011-S2014, a capture function key is provided and, when a user has insight, impression, or comment data to enter, is presented according to the operation the user performs on the currently playing picture and that operation's attribute, so the key appears exactly when needed. Based on the detected operation of the capture function key, at least one piece of audio and/or video data generated by the viewer for the currently playing picture is collected, so insights, impressions, and comments are captured automatically, which improves intelligence. And because the automatically captured data is audio or video data rather than text data, the tedium of watching text-only insight or comment data is avoided.
Referring to Fig. 5, consider a teaching video on turnover rate with a playing duration of 5 minutes. If, during playback, the user feels that the playing picture at 3 minutes 20 seconds (the currently playing picture) is a key point in explaining the concept of turnover rate and wants it marked within the whole video, the user presses on the currently playing picture. The target device detects the press; if the press duration is judged longer than the preset duration, the operation attribute is considered to satisfy the predetermined condition, and a capture function key, function key 1 in Fig. 5, is presented on the currently playing picture, specifically at its center. The user then touches, presses, or slides on the capture function key; the target device detects this operation and regards it as meaning the user needs to enter an insight, impression, or comment. It then collects the insight, impression, or comment data the user speaks in audio form about the currently playing picture, or the data the user expresses in video form. Insights, impressions, and comments are thus captured automatically, which improves intelligence, and since the captured data is audio or video rather than text, tedium is avoided.
In the above scheme, the capture function key may be used to capture the insight, impression, or comment data the user speaks about the currently playing picture, or the data the user expresses in video form. Whether the key captures audio or video data can be determined from how the target device identifies the operation the user performs on it: different operations on the capture function key can be preconfigured for audio capture versus video capture. For example, a press on the capture function key can trigger the audio-capture function, while a touch such as a click triggers the video-capture function. The target device detects the user's operation on the capture function key, regards it as meaning the user needs to enter an insight, impression, or comment, and further checks whether the operation is a press or a click: a press means audio data is to be captured, a click means video data is to be captured. One and the same capture function key can thus capture either audio or video data, as the sketch below illustrates.
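A sketch of this single-key variant: distinguishing a press from a click by how long the pointer is held. The timing threshold and the element id are illustrative assumptions.

```typescript
const captureKey = document.getElementById("capture-key")!;
const LONG_PRESS_MS = 500; // held longer than this counts as a press
let pressedAt = 0;

captureKey.addEventListener("pointerdown", () => {
  pressedAt = performance.now();
});

captureKey.addEventListener("pointerup", () => {
  const heldMs = performance.now() - pressedAt;
  if (heldMs >= LONG_PRESS_MS) {
    startAudioCapture(); // pressing operation: collect audio data
  } else {
    startVideoCapture(); // click operation: collect video data
  }
});

declare function startAudioCapture(): void;
declare function startVideoCapture(): void;
```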
In an optional scheme, the capture function keys include a first capture function key and a second capture function key. The first capture function key is used to collect (enter) audio data generated by the viewer for the currently playing picture; the second capture function key is used to collect (enter) video data generated by the viewer for the currently playing picture. When an operation on the first function key is detected, an audio capture device is started and collects the audio data generated by the viewer for the currently playing picture; or, when an operation on the second function key is detected, a video capture device is started and collects the video data generated by the viewer for the currently playing picture. Unlike the single capture function key above, which can capture either audio or video data once the operation attribute of the user's first predetermined operation on the currently playing picture satisfies the predetermined condition, in this alternative two function keys, a first function key and a second function key, may be presented when that condition is satisfied. If the first function key is regarded as the key for capturing audio data, the second is the key for capturing video data, and vice versa. Taking the first function key as the audio-capture key and the second as the video-capture key: when an operation on the first function key is detected, an audio capture device such as a microphone is started, and the microphone collects the audio data the user generates for the currently playing picture; when an operation on the second function key is detected, a video capture device such as a camera is started, and the camera collects the video data the user generates for the currently playing picture. Using different function keys for capturing audio data and video data ensures that both are captured accurately.
Referring to Fig. 6, again consider the 5-minute turnover-rate teaching video played to the 3-minute-20-second position, where the user feels that the content of the playing picture at 3 minutes 20 seconds (the currently playing picture) is a key point in explaining the concept of turnover rate and wants it marked within the whole video. The user slides on the currently playing picture; the target device detects the slide and compares the slide distance with the preset distance. If the slide distance is judged greater than the preset distance, the operation attribute is considered to satisfy the predetermined condition, and two capture function keys are presented on the currently playing picture: function key 11 (the first function key) and function key 12 (the second function key) in Fig. 6. Function key 11 is used to collect audio insights, impressions, and comments; function key 12 is used to collect video ones.
The user touches, presses, or slides on function key 11; the target device detects these operations on function key 11 and regards them as meaning the user needs to enter an audio insight, impression, or comment. The display jumps from Fig. 6 to Fig. 7: as shown in Fig. 7, the microphone is turned on and the user speaks insight data about the content of the currently playing picture, for example "This part is very good; to summarize, the turnover rate is an index parameter in the stock industry that is proportional to a stock's trading price." The microphone collects this data as associated data for the currently playing picture, and the collected data is stored in correspondence with the identification information of the currently playing picture.
In the illustration of Fig. 7, the duration of a single audio capture is 60 s, i.e., the microphone captures at most 60 s per capture. The gradual shortening of the bar images indicates to the user that the single-capture duration is running out, and the length of a bar image indicates how much of the 60 s remains. Of the 6 bar images shown in Fig. 7, the longest represents 60 s, the second shortest represents 20 s, and the shortest represents 10 s. When the target device turns on the microphone, it counts down the single-capture duration; with 20 s of capture time left it makes the short bar image flash, and with 10 s left it makes the shortest bar image flash, to prompt the user about the remaining time. In Fig. 7, function key 13 is a cancel function key and function key 14 is a confirm/complete function key. If the target device detects the user operating function key 13 during recording, the capture is cancelled; if it detects the user operating function key 14, the recording is confirmed complete. The playing-time information of the currently playing picture is then read, used as the picture's identification information, and stored in correspondence with the recorded audio insight, impression, or comment.
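For illustration, the 60 s single-capture limit with a countdown can be sketched with the standard `MediaRecorder` API; the countdown callback stands in for the shrinking and flashing bar images and is an assumption of this sketch.

```typescript
// Records one insight from the microphone, stopping at the 60 s cap
// described above or when recorder.stop() is triggered from the
// confirm/complete key.
async function recordAudioInsight(
  onCountdown: (secondsLeft: number) => void,
  maxSeconds = 60,
): Promise<Blob> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  let remaining = maxSeconds;
  const timer = setInterval(() => {
    remaining -= 1;
    onCountdown(remaining); // e.g. flash the 20 s or 10 s bar image
    if (remaining <= 0 && recorder.state === "recording") recorder.stop();
  }, 1000);

  recorder.start();
  return new Promise((resolve) => {
    recorder.onstop = () => {
      clearInterval(timer);
      stream.getTracks().forEach((t) => t.stop());
      resolve(new Blob(chunks, { type: recorder.mimeType }));
    };
  });
}
```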
The user touches, presses, or slides on function key 12; the target device detects these operations on function key 12 and regards them as meaning the user needs to enter a video insight, impression, or comment. The display jumps from Fig. 6 to Fig. 8: as shown in Fig. 8, the camera is turned on and the user presses function key 15 to record a video. The camera captures the picture of the user speaking the insight, impression, or comment together with what is said, for example "I recommend watching this part carefully; it mainly explains that the turnover rate is an important index parameter in the stock industry that changes with a stock's trading price." The camera collects this data as associated data for the currently playing picture, and the collected data is stored in correspondence with the identification information of the currently playing picture. In this scheme, different function keys are used to collect audio data and video data, so the audio and video associated data is obtained accurately.
It should be understood that Figs. 5-8 are merely specific examples provided for ease of understanding and do not limit the implementations of the embodiments of this application; any reasonable arrangement in practice falls within the scope of these embodiments.
Fig. 4 shows a fourth embodiment of the method for outputting data in the embodiments of this application. As shown in Fig. 4, the method includes:
S401: while a first video is playing, judging whether at least one piece of associated data related to the currently playing picture of the first video can be obtained;
S402: if so, obtaining at least one piece of associated data related to the currently playing picture of the first video;
S403: presenting prompt data, where the prompt data is prompt data for at least part of the at least one piece of associated data and is used to indicate that the at least part of the associated data can be output;
S404: detecting a second predetermined operation on the prompt data, the second predetermined operation being an operation for outputting the at least part of the associated data;
S405: generating a first instruction based on the second predetermined operation, the first instruction being an instruction to output the at least part of the associated data;
S406: in response to the first instruction, obtaining the output type of the at least part of the associated data, the output type characterizing output in audio or video form;
S407: outputting the at least part of the associated data in that output type.
S401-S407 are executed by the device that outputs data (the target device). If it is judged that no associated data related to the currently playing picture of the first video can be obtained, refer to S104 above; repeated content is not restated.
Unlike the schemes of Figs. 1-3, which present the associated data directly, in S401-S407, when it is judged that at least one piece of associated data related to the currently playing picture of the first video can be obtained, that data is obtained and prompt data is presented; when a predetermined operation (the second predetermined operation) on the prompt data is detected, an instruction to output the at least part of the associated data (the first instruction) is generated; and in response to the first instruction, the output type of the at least part of the associated data is obtained and the data is output in that type. In other words, instead of presenting the associated data immediately, prompt data announcing that at least part of the associated data can be output is presented first, and the associated data is output in its output type only when the user operates on the prompt data. This matches the user's actual output needs and improves the user experience.
In an optional scheme, when the associated data obtained in S402 amounts to two or more pieces, the method further includes: obtaining the prompt data for each piece of associated data, determining each piece of prompt data's presentation position on the display screen, and displaying each piece at its corresponding position. In other words, if a playing picture of a video has two or more pieces of related associated data, each piece has its own corresponding prompt data. To keep the prompt data from overlapping when displayed, presentation positions on the display screen are allocated, a different position per piece of prompt data, and each piece is displayed at its allocated position. The user can thus tell from the prompt data displayed on screen that the currently playing picture has several pieces of associated data (as many as there are pieces of prompt data), which is convenient for the user.
In an optional scheme, the prompt data includes at least partial text data of each piece of the at least part of the associated data, the text data being obtained by performing text recognition on the associated data. In this alternative, to let the different pieces of prompt data actually serve as prompts, text recognition may be performed on the audio within the data collected by the audio or video capture device, which amounts to recognizing as text what the user says in the insight, impression, or comment; the recognized text becomes part of the prompt data and is presented together with it. The recognized text may be a word-for-word transcription of what the user says, or just its main meaning, or only a portion of it. Given the limited presentation position, text data that conveys the main meaning is generally presented, possibly just the first few words the user speaks. This lets users conveniently grasp what each piece of associated data expresses and improves the user experience.
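A sketch of building this preview text, assuming a hypothetical `transcribe` speech-to-text helper (not a real library call, any recognition service could stand behind this signature); only the trimming to the first few words is illustrated.

```typescript
// Hypothetical speech-to-text helper; an assumption of this sketch.
declare function transcribe(audio: Blob): Promise<string>;

// Keeps only the first few recognized words so the text data fits the
// prompt datum's limited presentation position.
async function buildPromptPreview(audio: Blob, maxWords = 5): Promise<string> {
  const text = await transcribe(audio);
  const words = text.trim().split(/\s+/);
  return words.slice(0, maxWords).join(" ") + (words.length > maxWords ? "…" : "");
}
```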
The following prompt data at least includes identification information of a user who generates a heartburn, an experience or a comment, such as a user avatar, a type of associated data (audio or video), a play duration of the associated data, and the like, and further includes: the text data of the first few words spoken by the user is taken as an example, and the scheme of the fourth embodiment is described with reference to fig. 9-12.
In the application scenario, it is assumed that, during the process of viewing the handover rate teaching course video, the user 1 and the user 2 both generate ideas of labeling the content emphasis of the playing picture for the 3 rd, 20 th second playing picture. Through fig. 5-8, respective minds for the play screen contents of 3 minutes and 20 seconds are realized, such as that of the user 1, which is "very good here, summarizing that the hand-off rate is an index parameter in the stock industry that can be proportional to the trading price of the stock". The user 2 is happy that "the recommendation is watched with emphasis, which means that the hand-off rate is an important index parameter in the stock industry and can change with the change of the trading price of the stock". The mind of the user 1 is entered by means of audio. The mind of the user 2 is entered by means of video. The mind of the user 1 and the identification information of the playing picture of the 3 rd minute and 20 th second are correspondingly stored. The mind of the user 2 and the identification information of the playing picture of the 3 rd minute and 20 th second are correspondingly stored.
While user 3 watches the turnover-rate teaching course video, when playback reaches the 3rd minute 20th second playing picture, it is determined whether associated data related to that playing picture is stored. If so, the play time information of the 3rd minute 20th second playing picture within the whole video is read, and the associated data is looked up in the stored data according to that play time information. The data found in the stored data that corresponds to the 3rd minute 20th second playing picture is the associated data related to that picture. In this application scenario 1, it is assumed that the insight data of user 1 and the insight data of user 2 are found for the 3rd minute 20th second playing picture. The prompt data set for each piece of insight data is read, a presentation position on the display screen is allocated to each prompt datum, and the prompt data are presented at their respective positions. The prompt data generated for the 3rd minute 20th second playing picture by user 1 and by user 2, and their presentation positions, are shown in fig. 9. Each prompt datum includes the user avatar, the type of the associated data (audio or video), the play duration of the associated data, and the text data of the first few words spoken by the user. One prompt datum is presented on the left of the display screen as shown in fig. 9; it also includes the text data "very good here". The other prompt datum is presented on the right of the display screen; it also includes the text data "recommended for careful viewing".
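A minimal sketch of the storage and lookup just described, keyed by the playing picture's time position; the names (AssociatedData, save, lookup) are assumptions for illustration:

interface AssociatedData {
  userId: string;
  mediaType: "audio" | "video";
  blob: ArrayBuffer; // the recorded insight, experience, or comment
}

const store = new Map<number, AssociatedData[]>(); // key: play time in seconds

function save(playTimeSec: number, data: AssociatedData): void {
  const list = store.get(playTimeSec) ?? [];
  list.push(data);
  store.set(playTimeSec, list);
}

// Called as playback reaches each picture; returns [] when nothing is stored.
function lookup(playTimeSec: number): AssociatedData[] {
  return store.get(playTimeSec) ?? [];
}

In this scenario, lookup(200) would return the insight data of user 1 and user 2 for the 3rd minute 20th second (200 seconds) playing picture.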
For the prompt data presented on the left of the display screen, when an operation (second predetermined operation) on the audio play key by the user is detected, the display jumps from fig. 9 to fig. 10 and audio playback is performed. Specifically, when the user's operation on the audio play key is detected, an instruction for outputting the associated data generated by user 1 for the 3rd minute 20th second playing picture, specifically an audio output instruction (first instruction), is generated; in response to the audio output instruction, the output type is determined to be audio, and the insight data generated by user 1 for the 3rd minute 20th second playing picture is played as audio.
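The mapping from the second predetermined operation to the first instruction, and then to an output type, can be sketched as follows; Note, FirstInstruction, and the handler names are illustrative assumptions:

type OutputType = "audio" | "video";
interface Note { url: string; mediaType: OutputType }
interface FirstInstruction { note: Note; outputType: OutputType }

function onPlayKeyPressed(note: Note): FirstInstruction {
  // The key operated determines the output type: the audio play key yields
  // an audio output instruction (fig. 10), the video play key a video
  // output instruction (fig. 11).
  return { note, outputType: note.mediaType };
}

function execute(instr: FirstInstruction): void {
  if (instr.outputType === "audio") {
    void new Audio(instr.note.url).play(); // output as audio
  } else {
    // output as video in a playing window, as sketched further below
  }
}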
For the prompt data presented on the right of the display screen, when an operation (second predetermined operation) on the video play key by the user is detected, the display jumps from fig. 9 to fig. 11 and video playback is performed. Specifically, when the user's operation on the video play key is detected, an instruction for outputting the associated data generated by user 2 for the 3rd minute 20th second playing picture, specifically a video output instruction (first instruction), is generated; in response to the video output instruction, the output type is determined to be video, and the insight data generated by user 2 for the 3rd minute 20th second playing picture is played as video. Technically, to play that insight data as video, a video playing window may be created on top of the currently playing picture of the video, and the insight data generated by user 2 for the 3rd minute 20th second playing picture is played in that window. The size of the video playing window and its display position on the display screen can be preset. As shown in fig. 11, the video of the insight data generated by user 2 for the 3rd minute 20th second playing picture is 2 minutes long and is played from the beginning; during playback, the user can operate the pause key to stop the video, and can also operate the window-close key to close the window.
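One possible browser-side realization of such a video playing window, overlaid on the current playing picture with a preset size and position plus pause and close controls, is sketched below; the DOM structure and styling are assumptions, not the application's actual implementation:

function playNoteInWindow(noteUrl: string): void {
  const win = document.createElement("div");
  win.style.cssText = "position:absolute;right:2%;top:10%;width:30%;"; // preset size and position

  const video = document.createElement("video");
  video.src = noteUrl;
  video.controls = true; // exposes a pause control; a custom pause key also works

  const close = document.createElement("button");
  close.textContent = "x";
  close.onclick = () => win.remove(); // the window-close key

  win.append(video, close);
  document.body.append(win);
  void video.play(); // plays the insight video from the beginning
}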
It will be understood by those skilled in the art that if the insight data generated by user 2 for the 3rd minute 20th second playing picture was recorded as video, then besides playing the video, only the audio data of the recorded video data may be played. In that case, when an operation (second predetermined operation) on the video play key by the user is detected, as in the jump from fig. 9 to fig. 12, the audio in the video is played from the beginning; during playback, the user can operate the pause key to stop the audio.
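A sketch of this audio-only variant: in most browsers an audio element can decode the audio track of a common video container, so the video picture is simply never rendered. This is one possible realization, not the only one:

function playAudioOnly(noteUrl: string): HTMLAudioElement {
  const audio = new Audio(noteUrl); // plays only the audio track of the recorded video
  void audio.play();
  return audio; // the pause key can call audio.pause() on this handle
}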
In practical application, while the associated data is being played, the turnover-rate teaching course video is paused to avoid confusion. The insight data generated by each user for the 3rd minute 20th second playing picture is hidden after its playback completes, and the turnover-rate teaching course video then continues playing.
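A sketch of that pause-play-resume sequence, assuming HTMLMediaElement handles for the course video and the note; the function name is an assumption:

function playNoteExclusively(course: HTMLVideoElement, note: HTMLMediaElement, prompt: HTMLElement): void {
  course.pause(); // avoid two sound sources playing at once
  note.onended = () => {
    prompt.hidden = true; // hide the prompt data after its note finishes
    void course.play();   // then continue the turnover-rate course video
  };
  void note.play();
}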
In the foregoing solution, if the insight data shown in fig. 10 or 11 was generated by user A for the 3rd minute 20th second playing picture, the prompt data shown in fig. 10 or 11 can be dragged on the display screen; during dragging, only the insight, experience, or comment data recorded by user A is displayed. If user A generated several pieces of insight data for the 3rd minute 20th second playing picture, that is, there are several prompt data, the prompt data do not overlap during dragging. If a dragged prompt datum is dropped on a blank area, it is displayed there. In practical applications, given the limited size of the display screen, an upper limit on the number of simultaneously presented prompt data should be set in advance, for example 10. Additionally or alternatively, for the same playing picture content of the same video, only 10 pieces of insight data may be collected or entered, and further insight data for that picture content is not collected or entered. This ensures the clarity and effectiveness of the presented prompt data.
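The upper limit can be enforced at collection time; a sketch, reusing the kind of timestamp-keyed store shown earlier, with MAX_NOTES = 10 taken from the example above:

const MAX_NOTES = 10;
const notesByTime = new Map<number, unknown[]>();

function trySave(playTimeSec: number, note: unknown): boolean {
  const list = notesByTime.get(playTimeSec) ?? [];
  if (list.length >= MAX_NOTES) return false; // refuse further collection for this picture
  list.push(note);
  notesByTime.set(playTimeSec, list);
  return true;
}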
The scheme has at least the following advantages:
1) For a given playing picture of a video, when insight, experience, or comment data generated by a user for that playing picture exists, the data is obtained and output automatically, realizing automatic output of the associated data and embodying intelligence;
2) the user currently watching the video can learn which users made insights, experiences, or comments about which picture in the video, improving the viewing experience and realizing targeted output of the associated data;
3) the associated data is output as audio or video, avoiding the monotony of plain text and realizing engaging output;
4) for a given playing picture of the video, other users' insights, experiences, or comments on that picture's content can be output automatically and promptly, conveying them in time to the user watching the current playing picture and improving the efficiency of information delivery;
5) the prompt data is presented first, and the associated data is output according to its output type only after the user operates on the prompt data, which meets the user's actual output needs and improves the user experience.
Application scenario 1 of the embodiment of the present application is as follows: when a user watching a video wants to take notes on the content of a certain playing picture, the note content can be recorded using the scheme of the embodiment of the present application. When the user watches the video again, the note content made for that playing picture is automatically retrieved and displayed when the picture plays, making it convenient to review the note content together with the picture content it was based on, and improving the efficiency of information delivery.
Application scenario 2 of the embodiment of the present application is as follows: when user A watches a video and a certain playing picture is played, the insight, experience, or comment data of other users on that playing picture can be automatically retrieved and displayed, so that user A learns other users' views on that playing picture. Other users' insights, experiences, or comments on the playing picture are thus conveyed in time to the user watching it, such as user A, improving the efficiency of information delivery. When user A watches the playing picture again, user A can see not only other users' insights, experiences, or comments on the picture content, but also the comments or annotations user A previously generated for that playing picture content.
The technical scheme provided by the embodiment of the present application can be regarded as a scheme for quickly annotating the content of a certain playing picture while watching a video, which increases the readability of the annotated content. The user's annotations can be made as audio or video while watching, so the annotation information is richer, and later reference to or review of the annotations is convenient.
An embodiment of the present application provides an apparatus for outputting data, as shown in fig. 13, the apparatus includes: a playback unit 11, a judgment unit 12, an acquisition unit 13, and an output unit 14; wherein,
a playing unit 11 for playing the first video;
a judging unit 12, configured to judge whether at least one piece of associated data related to a currently played picture of the first video can be obtained, and if yes, trigger the obtaining unit;
an obtaining unit 13 configured to obtain at least one associated data related to a currently playing picture of the first video;
an output unit 14, configured to output at least part of the at least one associated data;
wherein the output unit 14 outputs the at least part of the associated data in an audio or video manner, the associated data being generated by the viewer aiming at the current playing picture.
In the foregoing aspect, the apparatus includes:
the acquisition unit is used for acquiring at least one audio data and/or at least one video data generated by at least one viewer aiming at the current playing picture;
the first determining unit is used for determining the collected audio data and video data as the related data related to the current playing picture;
and the storage unit is used for correspondingly storing the associated data and the identification information of the current playing picture.
In the foregoing solution, the obtaining unit 13 is configured to obtain identification information of the currently played picture;
and determining the associated data corresponding to the identification information of the current playing picture as the associated data related to the current playing picture from the associated data and the identification information of the playing picture which are stored correspondingly.
In the foregoing aspect, the apparatus includes:
a first detection unit configured to detect a first predetermined operation that the viewer generates with respect to the currently played picture; obtaining an operation attribute of a first preset operation; presenting a collection function key under the condition that the operation attribute meets a preset condition; the acquisition function key is used for acquiring audio data and/or video data generated by a viewer aiming at the current playing picture;
and detecting the operation of the viewer on the acquisition function key, and triggering an acquisition unit to acquire at least one audio data and/or at least one video data generated by the viewer aiming at the current playing picture.
In the foregoing solution, the collection function keys include a first collection function key and a second collection function key; the first acquisition function key is used for acquiring audio data generated by the viewer aiming at the current playing picture; the second acquisition function key is used for acquiring video data generated by the viewer aiming at the current playing picture;
the first detection unit is used for starting the audio acquisition device under the condition that the operation of a first function key is detected, and the audio acquisition device acquires audio data generated by the viewer aiming at the current playing picture;
or,
and under the condition that the operation aiming at the second function key is detected, starting a video acquisition device, and acquiring video data generated by the viewer aiming at the current playing picture by the video acquisition device.
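In a browser context, starting the audio or video acquisition device could look like the following sketch, using the standard getUserMedia and MediaRecorder APIs; the wiring to the actual function keys is left out, and the function name is an assumption:

async function startCapture(kind: "audio" | "video"): Promise<MediaRecorder> {
  const stream = await navigator.mediaDevices.getUserMedia(
    kind === "audio" ? { audio: true } : { audio: true, video: true },
  );
  const recorder = new MediaRecorder(stream);
  recorder.start(); // records the viewer's data for the current playing picture
  return recorder;
}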
In the foregoing aspect, the apparatus includes:
the output unit 14 is configured to present prompt data, where the prompt data is prompt data for the at least part of the associated data, and is used to prompt that the at least part of the associated data can be output;
a second detection unit, configured to detect a second predetermined operation for the hint data, where the second predetermined operation is an operation for outputting the at least part of the associated data;
generating a first instruction based on the second predetermined operation, wherein the first instruction is an instruction for outputting the at least part of the associated data;
responding to a first instruction, and obtaining an output type of the at least part of the associated data; the output type is characterized as a type which is output by audio or video;
the output unit 14 is configured to output the at least part of the associated data in the output type.
In the foregoing solution, the obtaining unit 13 is configured to obtain each piece of prompt data for each piece of associated data; determining the presentation position of each prompt datum on the display screen; the output unit 14 is configured to display each piece of prompt data at a corresponding presentation position.
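Purely as an illustration of how the units of fig. 13 relate, the following sketch wires a judging unit to the obtaining and output units; the interfaces are assumptions, not the application's actual structure:

interface ObtainingUnit { obtain(pictureId: string): unknown[] }
interface OutputUnit { output(data: unknown[]): void }

class JudgingUnit {
  constructor(private obtaining: ObtainingUnit, private out: OutputUnit) {}
  onPicture(pictureId: string): void {
    const data = this.obtaining.obtain(pictureId);
    if (data.length > 0) this.out.output(data); // trigger obtaining and output on a hit
  }
}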
In practical applications, the judging unit 12, the obtaining unit 13, the first detection unit, the second detection unit, and the first determining unit in the apparatus for outputting data of the embodiment of the present application may be implemented by a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Micro Control Unit (MCU), or a Field-Programmable Gate Array (FPGA) in the apparatus. The storage unit may be implemented by a memory. The output unit 14 and the playing unit 11 can be implemented by a speaker or a display.
It should be noted that, since the apparatus for outputting data of the embodiment of the present application solves its problem on a principle similar to that of the method for outputting data, the implementation process and principle of the apparatus can be understood by reference to the implementation process and principle of the method, and repeated details are not described again. The apparatus embodiment has the same beneficial effects as the method embodiment; for technical details not disclosed in the apparatus embodiment, reference should be made to the description of the method embodiment, which for brevity is not repeated here.
An embodiment of the present application further provides an apparatus for outputting data, including: one or more processors; a memory communicatively coupled to the one or more processors; one or more application programs; wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method described above.
In a specific example, the apparatus for outputting data according to the embodiment of the present application may be embodied as a structure as shown in fig. 14, and the apparatus for outputting data at least includes a processor 51, a storage medium 52, and at least one external communication interface 53; the processor 51, the storage medium 52 and the external communication interface 53 are all connected by a bus 54. The processor 51 may be a microprocessor, a central processing unit, a digital signal processor, a programmable logic array, or other electronic components with processing functions. The storage medium has stored therein computer executable code capable of performing the method of any of the above embodiments. In practical applications, the determining unit 12, the obtaining unit 13, the first detecting unit, the second detecting unit, and the first determining unit may be implemented by the processor 51.
Embodiments of the present application also provide a computer-readable storage medium, which stores a computer program, and when the program is executed by a processor, the computer program implements the method described above.
A computer-readable storage medium can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber device, and a portable Compact Disc Read-Only Memory (CD-ROM). Additionally, the computer-readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that all or part of the steps of the methods of the above embodiments can be implemented by a program instructing related hardware; the program can be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The embodiments described above are only a part of the embodiments of the present invention, and not all of them. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Claims (10)

1. A method of outputting data, the method comprising:
in the case where the first video is played,
determining whether at least one associated data related to a currently playing picture of the first video is available;
if yes, obtaining at least one piece of associated data related to a current playing picture of the first video;
outputting at least part of the at least one associated data;
wherein the at least part of the associated data can be output in an audio or video mode, and the associated data is generated by a viewer aiming at the current playing picture.
2. The method according to claim 1, wherein before obtaining at least one associated data related to the currently playing picture, the method comprises:
acquiring at least one audio data and/or at least one video data generated by at least one viewer aiming at the current playing picture;
determining the collected audio data and video data as related data related to the current playing picture;
and correspondingly storing the associated data and the identification information of the current playing picture.
3. The method according to claim 2, wherein said obtaining at least one associated data related to said currently playing picture comprises:
acquiring identification information of the current playing picture;
and determining the associated data corresponding to the identification information of the current playing picture as the associated data related to the current playing picture from the associated data and the identification information of the playing picture which are stored correspondingly.
4. The method of claim 2, further comprising:
detecting a first preset operation generated by the viewer aiming at the current playing picture;
obtaining an operation attribute of a first preset operation;
presenting a collection function key under the condition that the operation attribute meets a preset condition; the acquisition function key is used for acquiring audio data and/or video data generated by a viewer aiming at the current playing picture;
and detecting the operation of the viewer on the acquisition function key, and acquiring at least one audio data and/or at least one video data generated by the viewer aiming at the current playing picture.
5. The method of claim 4, wherein the capture function keys comprise a first capture function key and a second capture function key; the first acquisition function key is used for acquiring audio data generated by the viewer aiming at the current playing picture; the second acquisition function key is used for acquiring video data generated by the viewer aiming at the current playing picture;
under the condition that the operation aiming at the first function key is detected, starting an audio acquisition device, and acquiring audio data generated by the viewer aiming at the current playing picture by the audio acquisition device;
or,
and under the condition that the operation aiming at the second function key is detected, starting a video acquisition device, and acquiring video data generated by the viewer aiming at the current playing picture by the video acquisition device.
6. The method according to any of claims 1 to 5, wherein after obtaining at least one associated data related to the current play-out picture, the method further comprises:
presenting prompt data, wherein the prompt data is prompt data aiming at the at least part of the associated data and is used for prompting that the at least part of the associated data can be output;
detecting a second predetermined operation aiming at the prompt data, wherein the second predetermined operation is an operation for outputting the at least part of the associated data;
generating a first instruction based on the second predetermined operation, wherein the first instruction is an instruction for outputting the at least part of the associated data;
responding to a first instruction, and obtaining an output type of the at least part of the associated data; the output type is characterized as a type which is output by audio or video;
outputting the at least part of the associated data in the output type.
7. The method according to claim 6, wherein in the case that the obtained association data is two or more, the method further comprises:
obtaining respective prompt data for respective associated data;
determining the presentation position of each prompt datum on the display screen;
and displaying each prompt datum at the corresponding presentation position.
8. An apparatus for outputting data, the apparatus comprising:
a playing unit for playing the first video;
the judging unit is used for judging whether at least one piece of associated data related to the current playing picture of the first video can be obtained or not, and if so, the obtaining unit is triggered;
an obtaining unit configured to obtain at least one associated data related to a currently playing picture of a first video;
an output unit, configured to output at least part of the associated data in the at least one associated data;
the output unit outputs at least part of the associated data in an audio or video mode, wherein the associated data is generated by a viewer aiming at the current playing picture.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
10. An apparatus for outputting data, comprising:
one or more processors;
a memory communicatively coupled to the one or more processors;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-7.

Application publication date: 20200925