How to get all the comments from Disqus?

Submitted by 坚强是说给别人听的谎言 on 2019-12-03 07:54:32

Question


I want to fetch all the comments on a CNN article, where the comment system is Disqus. For example: http://edition.cnn.com/2013/02/25/tech/innovation/google-glass-privacy-andrew-keen/index.html?hpt=hp_c1

The commenting system requires clicking "load more" to see more comments. I tried parsing the HTML with PHP, but that cannot retrieve all the comments because they are loaded by JavaScript. Is there a more convenient way to retrieve all the comments for a specific CNN URL?

Has anyone done this successfully? Thanks in advance.


Answer 1:


The Disqus API contains a pagination method using cursors that are returned in the JSON response. See here for information about cursors: http://disqus.com/api/docs/cursors/

Since you mentioned PHP, something like this should get you started:

<?php
$apikey = '<your key here>'; // get keys at http://disqus.com/api/ (can be public or secret for this endpoint)
$shortname = '<the disqus forum shortname>'; // defined in: var disqus_shortname = '...';
$thread = 'link:<URL of thread>'; // IMPORTANT: the URL that you're viewing isn't necessarily the one stored with the thread of comments
//$thread = 'ident:<identifier of thread>'; // Use this if 'link:' has no results. Defined in: var disqus_identifier = '...';
$limit = '100'; // max is 100 for this endpoint. 25 is default

// The cursor value is appended per request inside listcomments()
$endpoint = 'https://disqus.com/api/3.0/threads/listPosts.json?api_key='.urlencode($apikey).'&forum='.urlencode($shortname).'&thread='.urlencode($thread).'&limit='.$limit.'&cursor=';

$j = 0;
listcomments($endpoint, '', $j); // an empty cursor requests the first page

function listcomments($endpoint, $cursor, $j) {

    // Standard CURL
    $session = curl_init($endpoint.urlencode($cursor));
    curl_setopt($session, CURLOPT_RETURNTRANSFER, 1); // instead of just returning true on success, return the result on success
    $data = curl_exec($session);
    curl_close($session);

    // Decode JSON data
    $results = json_decode($data);
    if ($results === NULL) die('Error parsing json');

    // On an API error the response is not a list of posts
    if (!isset($results->cursor) || !is_array($results->response)) die('Unexpected API response');

    // Comment response
    $comments = $results->response;

    // Cursor for pagination
    $cursor = $results->cursor;

    foreach ($comments as $comment) {
        $name = $comment->author->name;
        $message = $comment->message; // separate variable so $comment is not overwritten
        $created = $comment->createdAt;
        // Get more data...

        echo "<p>".$name." wrote:<br/>";
        echo $message."<br/>";
        echo $created."</p>";
    }

    // Follow the cursor while the API reports another page
    if ($cursor->hasNext) {
        listcomments($endpoint, $cursor->next, $j);
        /* to only run $j number of iterations, replace the call above with:
        $j++;
        if ($j < 10) {
            listcomments($endpoint, $cursor->next, $j);
        }*/
    }
}

?>
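
For comparison, the same cursor loop can be written without recursion. This is a minimal JavaScript sketch, not Disqus's own client: fetchAllPosts, fetchPage and makeDisqusFetcher are hypothetical names, and it assumes each decoded page has a 'response' array of posts plus a 'cursor' object with 'hasNext' and 'next' fields. The page fetcher is passed in, so the loop can be tried against canned data before pointing it at the real endpoint.

```javascript
// Collect every post by following cursor.next until hasNext is false.
// fetchPage(cursor) is a caller-supplied async function (hypothetical name)
// that returns one decoded JSON page for the given cursor string.
async function fetchAllPosts(fetchPage, maxPages = 1000) {
  const posts = [];
  let cursor = ''; // an empty cursor requests the first page
  for (let page = 0; page < maxPages; page++) {
    const result = await fetchPage(cursor);
    if (!result || !Array.isArray(result.response)) {
      throw new Error('Unexpected response from the API');
    }
    posts.push(...result.response);
    if (!result.cursor || !result.cursor.hasNext) break;
    cursor = result.cursor.next;
  }
  return posts;
}

// Example page fetcher for the listPosts endpoint. Assumes a runtime with a
// global fetch (a modern browser or Node 18+).
function makeDisqusFetcher(apiKey, shortname, thread) {
  return async function (cursor) {
    const params = new URLSearchParams({
      api_key: apiKey,
      forum: shortname,
      thread: thread,
      limit: '100',
      cursor: cursor,
    });
    const res = await fetch(
      'https://disqus.com/api/3.0/threads/listPosts.json?' + params
    );
    return res.json();
  };
}
```

Usage would be along the lines of fetchAllPosts(makeDisqusFetcher(key, shortname, 'link:<URL of thread>')). The maxPages argument is only a safety cap against an endless loop.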



Answer 2:


Just an addition: to get the URL of the Disqus comments iframe on any page that embeds it, run this JavaScript in the browser console:

var visit = (function () {
    var url = document.querySelector('div#disqus_thread iframe').src;

    // Upgrade an http:// URL to https:// and leave an https:// URL unchanged
    if (url.indexOf('https://') !== 0) return url.slice(0, 4) + 's' + url.slice(4);

    return url;
})();

The URL is now stored in 'visit':

console.log(visit);

I mined all the data into UTF-8 JSON and saved it as a .txt file, which can be found at this link. The JSON contains several variables, but the one you need is 'data', which is a JavaScript array.

Iterate through its entries and split each one at 'x==x'. The 'x==x' separator was added so that the userid of each commenter is captured too. If an entry has a name instead of a numeric userid, the account is no longer active.

To look up a user, append the userid to the profile URL: in https://disqus.com/users/106222183, 106222183 is the userid.
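
The splitting step might look like the sketch below. The layout of each entry is an assumption: the answer only says that entries are split at 'x==x' and that one part is the userid, so this sketch assumes the userid comes first and the comment text second. parseEntry is a hypothetical helper name, and the sample strings are made up for illustration.

```javascript
// Assumed entry layout: "<userid or name>x==x<comment text>".
// Swap the two fields if the real data has them in the other order.
function parseEntry(entry) {
  const sep = 'x==x';
  const at = entry.indexOf(sep);
  if (at === -1) {
    // No separator: treat the whole entry as comment text
    return { user: null, profileUrl: null, comment: entry };
  }
  const user = entry.slice(0, at);
  const comment = entry.slice(at + sep.length);
  // Only a purely numeric userid maps to a profile URL; a name in its place
  // means the account is no longer active, so no URL is built for it.
  const profileUrl = /^\d+$/.test(user)
    ? 'https://disqus.com/users/' + user
    : null;
  return { user, profileUrl, comment };
}

// Made-up sample entries in the assumed layout
const sample = ['106222183x==xNice article', 'someNamex==xOld comment'];
console.log(sample.map(parseEntry));
```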




Answer 3:


Without the API (the script below uses jQuery):

#disqus_thread {
  position: relative;
  height: 300px;
  background-color: #fff;
  overflow: hidden;
}
#disqus_thread:after {
  content: "";
  display: block;
  height: 10px;
  width: 100%;
  position: absolute;
  bottom: 0;
  background: white;
}
#disqus_thread.loaded {
  height: auto;
}
#disqus_thread.loaded:after{
    height:55px;
}
#disqus-load {
  display: block;
  padding: 11px 14px;
  text-align: center;
  color: #fff;
  font-size: 13px;
  font-weight: 500;
  line-height: 1.1;
  border: none;
  border-radius: 3px;
  background: rgba(29,47,58,.6);
  transition: background .2s;
  text-shadow: none;
  cursor: pointer;
}

<div class="disqus-comments">
    <div id='disqus_thread'></div>
    <div id='disqus-load'>Load comments</div>
</div>

<script type="text/javascript">


$(document).ready(function() {
    var disqus_shortname = 'testare-123';

    // Inject the Disqus embed script once the DOM is ready
    (function() {
        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
        dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
    })();

    // On click: request embed.js again, hide the button and expand the thread
    $('#disqus-load').on('click', function() {

        $.ajax({
            type: "GET",
            // protocol-relative, so an https page does not request an http script
            url: "//" + disqus_shortname + ".disqus.com/embed.js",
            dataType: "script",
            cache: true
        });

        $(this).fadeOut();
        $('#disqus_thread').addClass('loaded');
    });
});
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript" rel="nofollow">comments powered by Disqus.</a></noscript>


Source: https://stackoverflow.com/questions/15080258/how-to-get-all-the-comments-from-disqus
